Abstract

Suppose some random resource (energy, mass or space) $\chi \ge 0$ is to be shared
at random between (possibly infinitely many) species (atoms or fragments).
Assume $\mathbf{E}\chi = \theta < \infty$ and suppose the amount of the individual share is necessarily
bounded from above by 1. This random partitioning model can naturally
be identified with the study of infinitely divisible random variables with Levy 
measure concentrated on the interval. Special emphasis is put on these special 
partitioning models in the Poisson-Kingman class. The masses attached to the 
atoms of such partitions are sorted in decreasing order. Considering nearest- 
neighbors spacings yields a partition of unity which also deserves special in- 
terest. For such partition models, various statistical questions are addressed 
among which: correlation structure, cumulative energy of the first K largest 
items, partition function, threshold and covering statistics, weighted partition, 
Renyi's, typical and size-biased fragments size. Several physical images are 
supplied. 

When the unbounded Lévy measure of $\chi$ is $\theta x^{-1} \cdot I(x \in (0,1))\,dx$, the spacings
partition has Griffiths-Engen-McCloskey or GEM($\theta$) distribution and $\chi$
follows the Dickman distribution. The induced partition models have many remarkable
peculiarities which are outlined.

The case with finitely many (Poisson) fragments in the partition law is also 
briefly addressed. Here, the Levy measure is bounded. 

KEYWORDS: Random Partitions, Divisibility, Poisson Point Process on 
the Interval. 
1 Introduction



Random division models of a population into a (possibly large) number n of 
species, fragments or valleys with random weights or sizes have received considerable 
attention in various domains of applications. 

In disordered systems Physics, it was first recognized as an important issue in 
[1], as a result of phase space (in iterated maps or spin glasses models at thermal 
equilibrium) being typically broken into many valleys, or attraction basins, each 
with random weight. Problems involving the breakdown or splitting of some item 
into random component parts or fragments, also appear in many other fields of 
interest: for example the composition of rocks into component compounds in Geology 
(splitting of mineral grains or pebbles), the composition of biological populations into 
species or the random allocation of memory in Computer Sciences, but also models 
for gene frequencies in population genetics and biological diversity. 

All these applications deal with randomly broken objects and random splitting 
(see also [2] pages 25 and 30 for further motivations). Considering the random weights 
of the various species must sum to one, by normalization, the typical phase-space 
of these models is the interval of unit length, randomly split in such a way that 
the fragments' masses, sizes or energies must sum to one. The random structure of 
the population is then characterized by the ranked sequence of fragments' weights 
or sizes. This was observed in [3] (in the large n thermodynamic limit i.e. with a 
denumerable number of fragments). 

There are of course many ways to break the interval at random into n (possibly 
infinitely many) pieces and so one needs to be more specific. This manuscript is pre- 
cisely devoted to the study of some remarkable partition laws of the interval which 
arise from the partitioning problem of some random variable. 

More precisely, in Section 2, we shall first focus on the simplest "fair" statistical
model for splitting the interval into a finite number $n$ of fragments. It essentially
relies on normalization of a sequence of random variables by its sum. In more detail,
let $n > 1$ be a given integer. With $(S_k; k \ge 1)$ independent and identically distributed
positive random variables, consider the partial random walk sum $X_n := S_1 + \dots + S_n$.
Then $S_1, \dots, S_n$ constitute a simple random partition of $X_n$. Normalizing, define
$\varsigma_k := S_k / X_n$, $k = 1, \dots, n$. Then $(\varsigma_1, \dots, \varsigma_n)$ constitutes a random partition of unity. In
this model, there are $1 < n < \infty$ fragments with exchangeable random sizes $(\varsigma_1, \dots, \varsigma_n)$
summing up to 1. We shall focus on the special case where $S_1$ has gamma($\alpha$) distribution,
with $\alpha > 0$. In this case, $(\varsigma_1, \dots, \varsigma_n)$ has Dirichlet $D_n(\alpha)$ distribution. Let
$(\varsigma_{(1)}, \dots, \varsigma_{(n)})$ be the ranked version of $(\varsigma_1, \dots, \varsigma_n)$, with $\varsigma_{(1)} > \dots > \varsigma_{(n)}$. Passing to
the weak limit $n \uparrow \infty$, $\alpha \downarrow 0$ with $n\alpha = \theta$, one gets the ranked Poisson-Dirichlet
partition model PD($\theta$). It may also be obtained from the normalization process of the
jumps of a Moran subordinator, resulting in a random discrete distribution on the
infinite simplex; see [4]. We shall recall some of its remarkable properties. The PD
model exhibits many fundamental invariance properties. For a review of these results
and applications to Computer Science, Combinatorial Structures, Physics, Biology, etc.,
see for example [5] and the references therein; this model and related ones are also
fundamental in Probability Theory; see [6], [7], [8] and [9].

Several interesting partitioning models of the interval in the literature are based
on such a normalizing process. In the sequel, we shall instead study different
types of partitioning models, based on nearest-neighbor spacings.

More specifically, in Section 3, we shall discuss the following closely related
partitioning model. Let $\chi \ge 0$ be an infinitely divisible random variable with Lévy
measure concentrated on $(0,1)$ with total mass $\infty$ (the so-called unbounded case).
Assume $\mathbf{E}\chi = \theta < \infty$. Then the partition $\chi = \sum_{k \ge 1} \xi_{(k)}$ is obtained from the
constitutive ordered jumps $\xi_{(1)} > \dots > \xi_{(k)} > \dots$ of $\chi$. The system $(\xi_{(k)}; k \ge 1)$ constitutes
a Poisson point process on $(0, 1)$. Special emphasis is put on these partitioning models
of the Poisson-Kingman type in Section 3.1. Their specificity is that each fragment
in the decomposition of $\chi$ has size physically bounded from above by 1. Several
statistical questions arising in this partitioning context are then discussed, among
which: fragments correlation structure, cumulative sum of the $K$ largest items,
partition function, threshold statistics, filtered partition, typical and size-biased picked
fragments size. All these statistical questions are of concrete interest.

In Section 3.2, the following related partition is also considered: let
$\tilde{\xi}_k := \xi_{(k-1)} - \xi_{(k)}$, $k \ge 1$ (with $\xi_{(0)} := 1$), stand for spacings between consecutively
ordered $\xi_{(k)}$s. Then $\sum_{k \ge 1} \tilde{\xi}_k = 1$ and $(\tilde{\xi}_k, k \ge 1)$ constitutes an alternative random
partition of unity. In sharp contrast with limiting partitioning models of Section 2, its
construction does not involve any normalization procedure. Its specificity rather is a
consequence of the Lévy measure for jumps of $\chi$ being concentrated on $(0, 1)$, leading
to $(0, 1)$-valued $\xi_{(k)}$s. Similar statistical questions arising in this partition context
are also addressed.

In Section 4, a remarkable special case of the partitioning models developed in
Section 3 is studied in some detail. It corresponds to the following particular model:
assume the unbounded Lévy measure of $\chi$ takes the particular form $\frac{\theta}{x} \mathbf{1}_{x \in (0,1)}\,dx$.
Then, the random variable $\chi$ has Dickman distribution. The induced partitioning
models have many remarkable peculiarities which are outlined throughout. In particular,
the spacings partition has Griffiths-Engen-McCloskey or GEM($\theta$) distribution
whose ordered version is the Poisson-Dirichlet partition. The PD($\theta$) was obtained in
Section 2 from a very different construction based on normalization.

Finally, in Section 5, we assume that the Lévy measure of $\chi$ now has finite
total mass (the bounded case). In this case, we are led to random partitions of $\chi$ or
of unity into a finite Poissonian number of fragments. Some of their properties are
briefly outlined.

2 Exchangeable Dirichlet Partition with Finitely Many Fragments
and its Poisson-Dirichlet Limit

We start with recalling a standard construction of the Poisson-Dirichlet partition 
as a limiting partition from the exchangeable Dirichlet partition of unity. 



Dirichlet partition

Suppose there are $\infty > n > 1$ fragments with random sizes, say $(\varsigma_1, \dots, \varsigma_n)$,
where $(\varsigma_1, \dots, \varsigma_n)$ has exchangeable distribution, implying in particular that the $\varsigma_k$,
$k = 1, \dots, n$, all share the same distribution, say the one of $\varsigma$ (each item has
statistically the same mass). We also assume that $\varsigma$ has a density $f_\varsigma(x) > 0$ on
$(0, 1)$ with total mass 1 and that $\sum_{k=1}^{n} \varsigma_k = 1$ (almost surely), which is a strict
conservativeness property of the partition.

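With $\alpha > 0$, we assume more specifically that $(\varsigma_1, \dots, \varsigma_n)$ is distributed according
to the (exchangeable) Dirichlet $D_n(\alpha)$ density function on the simplex, meaning

(2.1) $f_{\varsigma_1, \dots, \varsigma_n}(x_1, \dots, x_n) = \frac{\Gamma(n\alpha)}{\Gamma(\alpha)^n} \prod_{k=1}^{n} x_k^{\alpha - 1} \cdot \delta_{\left(\sum_{k=1}^{n} x_k - 1\right)}$.

In this case, $\varsigma_k \stackrel{d}{=} \varsigma$, $k = 1, \dots, n$, and the individual fractions are all identically
distributed. Their common density on the interval $(0, 1)$ is given by

$f_\varsigma(x) = \frac{\Gamma(n\alpha)}{\Gamma(\alpha)\,\Gamma((n-1)\alpha)}\, x^{\alpha - 1} (1 - x)^{(n-1)\alpha - 1}$.

This is the one of a beta($\alpha$, $(n-1)\alpha$) random variable, with moment function

$\mathbf{E}\varsigma^q = \frac{\Gamma(n\alpha)\,\Gamma(\alpha + q)}{\Gamma(n\alpha + q)\,\Gamma(\alpha)}$.

The case $\alpha = 1$ corresponds to the uniform partition into $n$ fragments, for which

$\mathbf{E}\varsigma^q = \frac{(n-1)!}{(q + n - 1)(q + n - 2) \cdots (q + 1)}$.

This remarkable partition model is in the larger class of those for which
$\varsigma_k = S_k / (S_1 + \dots + S_n)$, where the $S_k$, $k = 1, \dots, n$, are independent and identically
distributed (iid) positive random variables. Indeed, assuming $S_1 \sim$ gamma($\alpha$), the
joint distribution of $(\varsigma_1, \dots, \varsigma_n)$ is well-known to be Dirichlet $D_n(\alpha)$.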
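
The normalization construction above can be checked numerically. The following is a minimal sketch, not part of the original text: it draws iid gamma($\alpha$) variables, normalizes them by their sum, and compares the empirical $q$-th moment of one fraction with the beta moment function. The function names, the seed and the parameter values ($n = 5$, $\alpha = 2$, $q = 2$) are illustrative assumptions.

```python
import math
import random


def dirichlet_partition(n, alpha, rng):
    """Return (S_1/X_n, ..., S_n/X_n) with S_k iid gamma(alpha) and X_n their sum."""
    s = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    x_n = sum(s)
    return [s_k / x_n for s_k in s]


def beta_moment(n, alpha, q):
    """Gamma(n a) Gamma(a + q) / (Gamma(n a + q) Gamma(a)), computed on the log scale."""
    log_value = (math.lgamma(n * alpha) + math.lgamma(alpha + q)
                 - math.lgamma(n * alpha + q) - math.lgamma(alpha))
    return math.exp(log_value)


rng = random.Random(0)  # fixed seed, for reproducibility of the illustration only
n, alpha, q = 5, 2.0, 2.0
partitions = [dirichlet_partition(n, alpha, rng) for _ in range(20000)]
empirical = sum(p[0] ** q for p in partitions) / len(partitions)
exact = beta_moment(n, alpha, q)
print(f"empirical moment = {empirical:.4f}, closed form = {exact:.4f}")
```

Working on the log scale with `math.lgamma` avoids overflow of the gamma function when $n\alpha$ is large.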



Poisson-Dirichlet partition and the Kingman limit

In such an "equitable" Dirichlet model, consider the situation where $n \uparrow \infty$, $\alpha \downarrow 0$
while $n\alpha = \theta > 0$. Such an asymptotic was first considered by [10]. As noted by
Kingman, $(\varsigma_1, \dots, \varsigma_n) \sim D_n(\alpha)$ itself has no non-degenerate limit. However, considering
the ranked version $(\varsigma_{(1)}, \dots, \varsigma_{(n)})$ with $\varsigma_{(1)} > \dots > \varsigma_{(n)}$, one may check that in the
Kingman limit, $(\varsigma_{(k)}, k = 1, \dots, n)$ converges in law to a Poisson-Dirichlet distribution,
say $(\varsigma_{(k)}, k \ge 1) \sim$ PD($\theta$) with $\varsigma_{(1)} > \dots > \varsigma_{(k)} > \dots$ The size-biased permutation of
$(\varsigma_{(k)}, k \ge 1)$ is, say, $(\varsigma_k, k \ge 1) \sim$ GEM($\theta$), the so-called Griffiths-Engen-McCloskey
law (see [4], Chapter 9). For this partition of unity, the following Residual Allocation
Model (or RAM) decomposition holds

(2.3) $\varsigma_k = \prod_{l=1}^{k-1} \bar{v}_l \, v_k$, $k \ge 1$.

Here $(v_k, k \ge 1)$ are iid with common law $v_1 \sim$ beta($1, \theta$) and $\bar{v}_1 := 1 - v_1 \sim$ beta($\theta, 1$).
Note that $\varsigma_1 \succeq_{st} \dots \succeq_{st} \varsigma_k \succeq_{st} \dots$, and that $(\varsigma_k, k \ge 1)$ is invariant under size-biased
permutation.
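
The residual allocation decomposition lends itself to a short simulation. The sketch below is illustrative and not part of the original text: it builds the first few GEM($\theta$) fragments by repeatedly breaking off a beta($1, \theta$) fraction of the mass still unallocated. The function name `gem_fragments`, the truncation level and the seed are assumptions made for the example.

```python
import random


def gem_fragments(theta, n_terms, rng):
    """Residual allocation: fragment k is v_k times the product of (1 - v_l), l < k."""
    remaining = 1.0  # mass not yet allocated to any fragment
    fragments = []
    for _ in range(n_terms):
        v = rng.betavariate(1.0, theta)
        fragments.append(remaining * v)
        remaining *= 1.0 - v
    return fragments, remaining


rng = random.Random(1)  # fixed seed, illustration only
theta = 2.0
fragments, leftover = gem_fragments(theta, 50, rng)
print("allocated mass:", sum(fragments), "leftover:", leftover)

# The first fragment is beta(1, theta), so its mean should be close to 1 / (1 + theta).
first_mean = sum(gem_fragments(theta, 1, rng)[0][0] for _ in range(20000)) / 20000
print("mean of first fragment:", first_mean)
```

Truncating after finitely many terms leaves a small unallocated remainder, which is returned explicitly so that the allocated mass plus the remainder is exactly the unit mass.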



As is well-known, the Poisson-Dirichlet partition can be understood as follows
(see [4], Chapter 9 and [11]). Let $\chi \ge 0$ be some infinitely divisible random variable
whose Lévy measure $\Pi(dx)$ is concentrated on $(0, \infty)$, with infinite total mass.
Assume more specifically that $\Pi(dx) = \theta e^{-x}/x \, dx$, $x > 0$. This is the Lévy measure
for jumps of a Moran (gamma) subordinator $(\chi_t; t \ge 0)$, with $\chi := \chi_1 \sim$ gamma($\theta$).
Let $(\xi_{(k)}, k \ge 1)$ be the ranked constitutive jumps of $\chi$, with $\chi = \sum_{k \ge 1} \xi_{(k)}$ and
$\xi_{(1)} > \dots > \xi_{(k)} > \dots$. Let $(\varsigma_{(k)} := \xi_{(k)}/\chi; k \ge 1)$ be the ranked normalized jumps,
hence with $1 = \sum_{k \ge 1} \varsigma_{(k)}$. Then, $(\varsigma_{(k)}; k \ge 1)$ has Poisson-Dirichlet PD($\theta$) distribution,
independent of $\chi = \chi_1$. This interpretation of $(\varsigma_{(k)}; k \ge 1)$ in terms of normalized
jumps of a gamma($\theta$)-distributed random variable is the limiting manifestation
of the fact that $\varsigma_k = S_k / (S_1 + \dots + S_n)$, $k = 1, \dots, n$ (with $S_k$ iid positive gamma($\alpha$)-
distributed random variables), characterizing the Dirichlet $D_n(\alpha)$ model.

3 Partitioning Constructions Based on Integrable Infinitely
Divisible Random Variables with Lévy Measure Concentrated
on (0, 1): the Unbounded Case

Suppose some random resource (or energy, mass or amount of space) $\chi \ge 0$ is to
be shared at random between (possibly infinitely many) species (atoms, fragments),
the amount of the individual share being necessarily bounded by one. This random
partitioning model can nicely be handled from infinitely divisible (ID) random
variables with Lévy measure concentrated on $(0,1)$ (see [12] and [13] for general
monographs on infinite divisibility).

If the physical interpretation of $\chi$ is energy (as in earthquake magnitude data,
with $\chi$ interpreted as the cumulative energy released on Earth over some period of
time), our construction is, to some extent, related to the Random Energy Model of
Derrida (see [1] and [3], and its "Poissonian" reformulation by [14] and [15]). The
partitioning nature of this problem is indeed well-known. In insurance models, the
individual share can represent the amount of a particular claim resulting from some
damage. In population genetics, it could be interpreted as species abundance in a large
population. In a partition of mass problem, the share attributed to each of the
constitutive elements of the partition is generally called the fragment size or mass. In
an economic context, the individuals' share is their asset. In any case, the peculiarity
of our model is that the individual shares of the constitutive atoms of the partition
are all physically necessarily bounded above.

3.1 Random Partition of "Energy": The Model 

Let $\chi \ge 0$ be an infinitely divisible random variable with Lévy measure for jumps
$\Pi(dx)$ supported by $(0, 1)$. Hence, with $\lambda \in \mathbb{R}$, let

(3.1) $\mathbf{E}e^{-\lambda\chi} = \exp\left\{-\int_0^1 \left(1 - e^{-\lambda x}\right) \Pi(dx)\right\}$

be the entire analytic Laplace-Stieltjes Transform (LST) of $\chi$'s law. We shall assume
that $\mathbf{E}\chi = \theta < \infty$, so that $0 < \int_0^1 x\,\Pi(dx) = \theta < \infty$. We shall also assume that $\Pi$ has
a (continuous) density, say $\pi$. In this case, the density $f_\chi$ of $\chi$ exists and is easily
seen to solve the functional equation

(3.2) $x f_\chi(x) = \int_0^{x \wedge 1} f_\chi(x - z)\, z\pi(z)\, dz$.

As is well-known, the random variable $\chi$ is naturally associated to $(\chi_t; t \ge 0)$, which
is a process with stationary independent increments. The process $(\chi_t; t \ge 0)$ is a
subordinator with no drift and non-negative jumps restricted to $(0, 1)$, and $\chi = \chi_1$.

Let $\overline{\Pi}(x) := \int_x^1 \Pi(dz)$. Two cases arise, depending on whether $\overline{\Pi}(0) = \infty$ (the
unbounded case) or $\overline{\Pi}(0) < \infty$ (the bounded case). In this Section, we shall first
address statistical issues arising in the unbounded partitioning model.

Random Partition of Energy $\chi$ (Unbounded Case): First Properties

Here $\overline{\Pi}(0) = \infty$, where $\overline{\Pi}(x) = \int_x^1 \Pi(dz)$ is the tail of the Lévy measure. In this
case, the total mass of $\Pi$ is infinite and $\chi > 0$. Plainly, we have

(3.3) $\chi = \sum_{k \ge 1} \overline{\Pi}^{-1}(S_k)$

where $(S_k; k \ge 1)$ are points of a homogeneous Poisson point process (PPP) on the
half-line (with $T_k := S_k - S_{k-1}$ iid and exp(1) distributed) and $\overline{\Pi}^{-1}$ is the decreasing
inverse of $\overline{\Pi}$.

This decomposition constitutes a random partition of the random variable $\chi$ in
terms of the (infinitely many) ranked constitutive jumps of $\chi_1$, all bounded by 1. Let
$\xi_{(k)} := \overline{\Pi}^{-1}(S_k)$, $k \ge 1$, be such $(0, 1)$-valued jumps arranged in decreasing order
$\xi_{(1)} > \dots > \xi_{(k)} > \dots$ They constitute a PPP on $(0, 1)$ with intensity $\Pi$, satisfying
$\sum_{k \ge 1} \mathbf{E}(\xi_{(k)}) = \theta$. In the decomposition of $\chi$ model, with $\chi = \sum_{k \ge 1} \xi_{(k)}$, the
random variable $\xi_{(k)}$ is interpreted as the $k$th fragment size.

When $\theta = 1$, the PPP system $(\xi_{(1)}, \dots, \xi_{(k)}, \dots)$ is said to be conservative in average.
If $\theta > 1$ ($\theta < 1$) we shall say that the system is excessive (defective).
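
The representation of the ranked jumps as images of a unit-rate Poisson process through the decreasing inverse of the Lévy tail can be sketched in code. The example below is illustrative only: the helper `ranked_jumps` is a hypothetical name, the series is truncated after finitely many terms, and the inverse tail $s \mapsto \exp\{-s/\theta\}$ (the Dickman case treated in the examples of this section) is chosen purely as a concrete instance.

```python
import math
import random


def ranked_jumps(inv_tail, n_terms, rng):
    """First n_terms ranked jumps inv_tail(S_k), where S_k are the arrival times
    of a unit-rate Poisson process (partial sums of iid exp(1) variables)."""
    s = 0.0
    jumps = []
    for _ in range(n_terms):
        s += rng.expovariate(1.0)
        jumps.append(inv_tail(s))
    return jumps


theta = 1.5  # illustrative value of the mean total energy


def dickman_inv_tail(s):
    """Assumed example: decreasing inverse of the tail -theta * log(x)."""
    return math.exp(-s / theta)


rng = random.Random(2)  # fixed seed, illustration only
n_terms = 60            # truncation level; the neglected jumps are negligibly small here
totals = [sum(ranked_jumps(dickman_inv_tail, n_terms, rng)) for _ in range(20000)]
mean_total = sum(totals) / len(totals)
print("empirical mean of the total:", mean_total, "theta:", theta)
```

Because the inverse tail is decreasing and the arrival times are increasing, the jumps come out already sorted in decreasing order, so no explicit ranking step is needed.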

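First and Second Order Statistics: Correlation Structure

We start with some elementary facts.

First, from the law of large numbers, $\frac{1}{k}\overline{\Pi}(\xi_{(k)}) \to 1$ almost surely as $k \uparrow \infty$,
supplying useful information on the way $\xi_{(k)}$ goes to 0 when $k$ grows.

Next, the one-dimensional distribution of $\xi_{(k)}$ is easily seen to be
$\mathbf{P}(\xi_{(k)} \le x) = \mathbf{P}(S_k > \overline{\Pi}(x)) = e^{-\overline{\Pi}(x)} \sum_{l=0}^{k-1} \frac{\overline{\Pi}(x)^l}{l!}$. In particular,

$\mathbf{E}(\xi_{(k)}) = \frac{1}{\Gamma(k)} \int_0^\infty \overline{\Pi}^{-1}(s)\, s^{k-1} e^{-s}\, ds$,

$\mathbf{E}(\xi_{(k)}^2) = \frac{1}{\Gamma(k)} \int_0^\infty \overline{\Pi}^{-1}(s)^2\, s^{k-1} e^{-s}\, ds$,

$\mathbf{E}(\xi_{(k)} \xi_{(k+l)}) = \int_0^\infty \int_0^\infty \overline{\Pi}^{-1}(s_1)\, \overline{\Pi}^{-1}(s_1 + s_2)\, \frac{s_1^{k-1} e^{-s_1}\, s_2^{l-1} e^{-s_2}}{\Gamma(k)\Gamma(l)}\, ds_1\, ds_2$

are the first and second moments of $\xi_{(k)}$, together with the joint second moment of $\xi_{(k)}$
and $\xi_{(k+l)}$, $k$ and $l \ge 1$. From this last expression, there is no general stationarity
property to be expected. As a result, with $w_k := \mathbf{E}\xi_{(k)}$, the second order quantity
and its $(q_1, q_2)$-definition domain

(3.4) $\Omega_l(q_1, q_2) := \mathbf{E}\left[\sum_{k \ge 1} w_k\, \xi_{(k)}^{q_1}\, \xi_{(k+l)}^{q_2}\right]$

deserve some interest. It gives weight $w_k$ to the $k$th contribution to the full
pair-correlation function at distance $l$.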



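Example (Dickman):

Assume $\Pi(dx) = \frac{\theta}{x} \mathbf{1}_{x \in (0,1)}\, dx$; then $\overline{\Pi}^{-1}(s) = \exp\{-s/\theta\}$. Then $\chi$'s distribution
is closely related to the Dickman distribution (see below). In this case,

$\mathbf{E}(\xi_{(k)}) = \left(\frac{\theta}{\theta + 1}\right)^k$,

$\sigma^2(\xi_{(k)}) = \left(\frac{\theta}{\theta + 2}\right)^k - \left(\frac{\theta}{\theta + 1}\right)^{2k}$,

$\mathbf{E}(\xi_{(k)} \xi_{(k+l)}) = \left(\frac{\theta}{\theta + 2}\right)^k \left(\frac{\theta}{\theta + 1}\right)^l$

are the mean and variance of $\xi_{(k)}$ and the correlation of $\xi_{(k)}$, $\xi_{(k+l)}$. In this particular
separable case, the covariance coefficient $\mathrm{cov}(\xi_{(k)}, \xi_{(k+l)}) = \sigma^2(\xi_{(k)}) \left(\frac{\theta}{\theta + 1}\right)^l$
has exponential decay with $l$. More generally, in this particular case, with definition
domain $q_1 + q_2 > -\theta/(\theta + 1)$ and $q_2 > -\theta$, we obtain

(3.5) $\Omega_l(q_1, q_2) = \frac{\theta^2}{\theta + (\theta + 1)(q_1 + q_2)} \left(\frac{\theta}{\theta + q_2}\right)^l$.

This particular Dickman model exhibits many other remarkable properties; these will
be emphasized in some detail in the sequel. □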
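
As a sanity check on the Dickman example, the mean of the $k$th ranked jump can be computed by integrating the inverse Lévy tail against the gamma($k$) density and compared with the geometric closed form $(\theta/(\theta+1))^k$. The sketch below is illustrative: `mean_kth_jump` is a hypothetical helper, and the trapezoid rule, the cut-off of the integration range and the step count are assumptions made for the example.

```python
import math


def mean_kth_jump(inv_tail, k, upper=60.0, steps=60000):
    """Trapezoid approximation of (1/Gamma(k)) * int_0^upper inv_tail(s) s^(k-1) e^(-s) ds."""
    h = upper / steps

    def integrand(s):
        if s == 0.0:
            # s^(k-1) equals 1 at s = 0 when k = 1 and vanishes there for k > 1
            return inv_tail(0.0) if k == 1 else 0.0
        log_gamma_density = (k - 1) * math.log(s) - s - math.lgamma(k)
        return inv_tail(s) * math.exp(log_gamma_density)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, steps))
    return total * h


theta = 2.0  # illustrative value


def inv_tail(s):
    """Inverse tail of the Dickman-type Levy measure: exp(-s / theta)."""
    return math.exp(-s / theta)


for k in (1, 2, 3):
    numeric = mean_kth_jump(inv_tail, k)
    closed_form = (theta / (theta + 1.0)) ** k
    print(k, numeric, closed_form)
```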



Campbell Formula

A very useful formula in our context is the Campbell formula. We first recall it and
then show its usefulness in the computation of statistical variables of concrete interest
in the partitioning problem under study.

Let $\lambda \ge 0$ and $g$ be a measurable function such that $\int_0^1 \left(1 - e^{-\lambda g(x)}\right) \Pi(dx) < \infty$;
then, by the Campbell formula (see [16]),

(3.6) $\mathbf{E}\exp\left\{-\lambda \sum_{k \ge 1} g\left(\overline{\Pi}^{-1}(S_k)\right)\right\} = \exp\left\{-\int_0^1 \left(1 - e^{-\lambda g(x)}\right) \Pi(dx)\right\}$

is the Laplace-Stieltjes transform (LST) of $\sum_{k \ge 1} g\left(\overline{\Pi}^{-1}(S_k)\right)$. In particular, its
mean value is

$\mathbf{E}\sum_{k \ge 1} g(\xi_{(k)}) = \int_0^1 g(x)\, \Pi(dx)$.

Let us draw some conclusions from these elementary facts.

Cumulative energy of the $K$ biggest and of the remaining events

Let $K \ge 1$. The random variable $\chi_K^- := \sum_{k > K} \xi_{(k)}$ represents the amount of
total energy $\chi$ concentrated in the lowest energy levels (at rank $K + 1$ and below).
Let us consider the problem of computing its law. We clearly have
$\chi_K^- = \sum_{k \ge 1} \overline{\Pi}^{-1}(S_K + S'_k)$, where the PPP $(S'_k; k \ge 1)$ is independent of $S_K \sim$ gamma($K$).
As a result, applying the Campbell formula (3.6) with $g(x) = \overline{\Pi}^{-1}(S_K + \overline{\Pi}(x))$, we obtain

$\mathbf{E}\exp\{-\lambda\chi_K^-\} = \mathbf{E}\exp\left\{-\lambda \sum_{k \ge 1} \overline{\Pi}^{-1}\left(S_K + \overline{\Pi}(\xi_{(k)})\right)\right\}$

(3.7) $= \mathbf{E}\exp\left\{-\int_0^1 \left(1 - e^{-\lambda \overline{\Pi}^{-1}(S_K + \overline{\Pi}(x))}\right) \Pi(dx)\right\}$,

where the last expectation is over $S_K$. Note that $\chi = \chi_K^+ + \chi_K^-$, where $\chi_K^+ := \sum_{k=1}^{K} \xi_{(k)}$
is the contribution of the $K$ largest energy levels to $\chi$.

This question is closely related to the following problem: let $x \ge 0$ be some
threshold value. Define

$K(x) := \inf\left(K \ge 1 : \chi_K^+ > x\right)$

to be the first time the cumulated fragments size of $(\xi_{(k)}; k \ge 1)$ exceeds $x$. Then,
$\mathbf{P}(K(x) > K) = \mathbf{P}(\chi_K^+ \le x)$ and the distribution of $K(x)$ results from the one of
$\chi_K^+$. See [17] for similar considerations.

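Example (Dickman):

Assuming $\Pi(dx) = \frac{\theta}{x} \mathbf{1}_{x \in (0,1)}\, dx$, then $\overline{\Pi}^{-1}(s) = \exp\{-s/\theta\}$ and

$\mathbf{E}\exp\{-\lambda\chi_K^-\} = \mathbf{E}\exp\left\{-\int_0^1 \left(1 - e^{-\lambda \overline{\Pi}^{-1}(S_K + \overline{\Pi}(x))}\right) \Pi(dx)\right\}$

$= \mathbf{E}\exp\left\{-\theta \int_0^1 \left(1 - e^{-\lambda x e^{-S_K/\theta}}\right) \frac{1}{x}\, dx\right\}$.

This shows that, in this particular case,

$\chi_K^- = R_K \cdot \chi$ and $\chi_K^+ = (1 - R_K) \cdot \chi$,

where $R_K := \exp\{-S_K/\theta\} \in (0,1)$ is log-gamma($K, \theta$) distributed, with
$\mathbf{E}R_K^q = [\theta/(\theta + q)]^K$, independent of $\chi$. When $K = \left(\log_2(1 + 1/\theta)\right)^{-1}$, the average wealth
$\mathbf{E}\chi_K^-$ is half the one of $\chi$.

In this example, we shall show below that $\chi$ has a Dickman type distribution, see
(4.5) below, resulting in an intricate distribution for $\chi_K^-$ and $\chi_K^+$ and consequently
for $K(x)$. □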
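
The half-energy rank quoted in this example follows from solving $(\theta/(\theta+1))^K = 1/2$ for $K$. A short sketch, with hypothetical function names and illustrative values of $\theta$, makes the arithmetic explicit; note that the resulting $K$ is in general not an integer, so it should be read as an interpolated rank.

```python
import math


def half_energy_rank(theta):
    """Rank K solving (theta / (theta + 1))**K = 1/2, i.e. K = 1 / log2(1 + 1/theta)."""
    return 1.0 / math.log2(1.0 + 1.0 / theta)


def remaining_fraction(theta, k):
    """Mean of exp(-S_k / theta) for S_k ~ gamma(k): the average share left beyond rank k."""
    return (theta / (theta + 1.0)) ** k


for theta in (0.5, 1.0, 3.0):
    k_half = half_energy_rank(theta)
    print(theta, k_half, remaining_fraction(theta, k_half))
```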



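Partition Function of $\chi$

Partition functions of energy are interesting quantities. Taking in particular
$g(x) = x^\beta$ in (3.6), the full Laplace-Stieltjes transform of $\sum_{k \ge 1} \xi_{(k)}^\beta$ is obtained as

(3.8) $\mathbf{E}\exp\left\{-\lambda \sum_{k \ge 1} \xi_{(k)}^\beta\right\} = \exp\left\{-\int_0^1 \left(1 - e^{-\lambda x^\beta}\right) \Pi(dx)\right\}$.

Its mean value

(3.9) $\phi(\beta) := \mathbf{E}\sum_{k \ge 1} \xi_{(k)}^\beta = \int_0^1 x^\beta\, \Pi(dx)$

is defined for values of $\beta > \beta_*$, for which $\int_0^1 x^\beta\, \Pi(dx) < \infty$, with

$\beta_* := \sup\left(\beta : \phi(\beta) = \infty\right) \in [0, 1)$.

Note indeed that $\phi(0) = \infty$ and $\phi(1) = \theta$.

In this setup, the Lévy measure $\Pi$ is interpreted as follows.

Let $N_+(x) := \sum_{k \ge 1} \mathbf{1}(\xi_{(k)} > x)$ be the random number of $\xi_{(k)}$ exceeding $x \in (0, 1)$.
Then $\overline{\Pi}(x)$ is the expected value of this number. Indeed

$\overline{\Pi}(x) = \sum_{k \ge 1} \mathbf{P}(\xi_{(k)} > x) = \sum_{k \ge 1} \mathbf{P}(S_k < \overline{\Pi}(x))$

$= e^{-\overline{\Pi}(x)} \sum_{k \ge 1} \sum_{l \ge k} \frac{\overline{\Pi}(x)^l}{l!} = e^{-\overline{\Pi}(x)} \sum_{l \ge 1} \frac{\overline{\Pi}(x)^l}{(l-1)!}$.

The random variable $\sum_{k \ge 1} \xi_{(k)}^\beta$ is called the partition function of $\chi$ and $\Pi$ its
structural (or occupation) measure.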



The Numbers of Atoms of $\chi$ above Cutoff $\epsilon$ and the Contribution to
Total Mass of those Atoms above and below $\epsilon$: Threshold Statistics

The above considerations naturally suggest the following problems of interest in
Statistics.

• Upper-threshold Statistics.

If $\epsilon \in (0, 1)$ is some cutoff or threshold value, let $N_+(\epsilon)$ count the number of
atoms of the partition of $\chi$ exceeding $\epsilon$. If $\chi$ is the amount of some natural resource
to be shared between infinitely many agents on the market, $\epsilon$ stands for the minimal
individual wealth below which each agent should be considered as indigent (e.g. below
the poverty line). If $\chi$ stands for "energy", $\epsilon$ could be interpreted as the level below
which micro-events are undetectable by the currently available measuring devices (if
one thinks of a sequence of earthquake magnitude data, for example).
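By the Campbell formula,

$\mathbf{E}\exp\{-\lambda N_+(\epsilon)\} = \exp\left\{-\int_0^1 \left(1 - e^{-\lambda \mathbf{1}(x > \epsilon)}\right) \Pi(dx)\right\}$

(3.10) $= \exp\left\{-\overline{\Pi}(\epsilon)\left(1 - e^{-\lambda}\right)\right\}$

is the full Laplace-Stieltjes transform of $N_+(\epsilon)$. This shows that $N_+(\epsilon)$ is in fact
Poisson distributed with intensity $\overline{\Pi}(\epsilon)$. Recalling $\overline{\Pi}(\epsilon) \uparrow \infty$ as $\epsilon \downarrow 0$, the law of large
numbers gives

(3.11) $N_+(\epsilon)/\overline{\Pi}(\epsilon) \xrightarrow{a.s.} 1$, $\epsilon \downarrow 0$.

That $N_+(\epsilon)$ is Poisson distributed may also be checked as follows: we have
$N_+(\epsilon) = \inf(k \ge 1 : \xi_{(k)} \le \epsilon) - 1$ and $\mathbf{P}(N_+(\epsilon) \ge k) = \mathbf{P}(\xi_{(k)} > \epsilon)$. This is also
$\mathbf{P}(S_k < \overline{\Pi}(\epsilon)) = e^{-\overline{\Pi}(\epsilon)} \sum_{l \ge k} \frac{\overline{\Pi}(\epsilon)^l}{l!}$, and $N_+(\epsilon)$ is Poisson with intensity $\overline{\Pi}(\epsilon)$.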
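
The Poisson character of the number of atoms above the cutoff can be illustrated by simulation. The sketch below assumes the Dickman-type tail $-\theta\log\epsilon$ used in the examples of this section, so that a jump $\exp\{-S_k/\theta\}$ exceeds $\epsilon$ exactly when $S_k < -\theta\log\epsilon$; the function name, seed and parameter values are illustrative assumptions. For a Poisson variable the mean and the variance coincide, which is what the two printed estimates are meant to show.

```python
import math
import random


def count_above(theta, eps, rng):
    """Number of jumps exp(-S_k / theta) exceeding eps, i.e. of arrival times S_k
    of a unit-rate Poisson process falling below -theta * log(eps)."""
    level = -theta * math.log(eps)
    count = 0
    s = rng.expovariate(1.0)
    while s < level:
        count += 1
        s += rng.expovariate(1.0)
    return count


theta, eps = 2.0, 0.05      # illustrative values
rng = random.Random(4)      # fixed seed, illustration only
counts = [count_above(theta, eps, rng) for _ in range(20000)]
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / (len(counts) - 1)
intensity = -theta * math.log(eps)
print("intensity:", intensity, "mean:", mean_count, "variance:", var_count)
```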

Remark (randomization of the cutoff):

A slightly more general problem is to consider the random variable
$N_+(\epsilon_1) := \sum_{k \ge 1} \mathbf{1}(\xi_{(k)} > \epsilon_k)$, where $(\epsilon_k, k \ge 1)$ are iid $(0, 1)$-valued random variables,
independent of $(\xi_{(k)}, k \ge 1)$. In this model, the poverty threshold attached to each agent
is assumed random but drawn from the same distribution and with mutual independence.

From the Campbell formula, we obtain

$\mathbf{E}\exp\{-\lambda N_+(\epsilon_1)\} = \exp\left\{-\mathbf{E}\overline{\Pi}(\epsilon_1)\left(1 - e^{-\lambda}\right)\right\}$,

showing that $N_+(\epsilon_1)$ is Poisson distributed with intensity $\mathbf{E}\overline{\Pi}(\epsilon_1)$ if $\mathbf{E}\overline{\Pi}(\epsilon_1) < \infty$.

This is useful in the problem of random covering of $(\xi_{(k)}, k \ge 1)$ by random
intervals with sizes $(\epsilon_k, k \ge 1)$. In particular, the covering probability is

$\mathbf{P}(N_+(\epsilon_1) = 0) = \exp\{-\mathbf{E}\overline{\Pi}(\epsilon_1)\}$.

Assuming (Dickman) $\overline{\Pi}(\epsilon) = -\theta\log\epsilon$ and $\epsilon_1 \sim$ Uniform(0, 1), the intensity reads
$\mathbf{E}\overline{\Pi}(\epsilon_1) = -\theta\int_0^1 \log\epsilon\, d\epsilon = \theta$. In this case, $N_+(\epsilon_1)$ simply is Poisson($\theta$) distributed. □

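The contribution to total mass $\chi$ of those $\xi_{(k)}$ above $\epsilon \in (0, 1)$, which is

$\chi_+(\epsilon) := \sum_{k \ge 1} \xi_{(k)} \mathbf{1}(\xi_{(k)} > \epsilon) = \sum_{k=1}^{N_+(\epsilon)} \xi_{(k)}$,

is such that

$\mathbf{E}\exp\{-\lambda\chi_+(\epsilon)\} = \exp\left\{-\int_0^1 \left(1 - e^{-\lambda x \mathbf{1}(x > \epsilon)}\right) \Pi(dx)\right\}$

$= \exp\left\{-\int_\epsilon^1 \left(1 - e^{-\lambda x}\right) \Pi(dx)\right\}$.

This is the LST of an infinitely divisible (ID) random variable with Lévy measure $\Pi$
concentrated on $(\epsilon, 1)$, for which clearly $\chi_+(\epsilon) \uparrow \chi$ ($\epsilon \downarrow 0$). This random variable is
of the compound Poisson type since

(3.12) $\mathbf{E}\exp\{-\lambda\chi_+(\epsilon)\} = \exp\left\{-\overline{\Pi}(\epsilon)\left(1 - \int_\epsilon^1 e^{-\lambda x}\, \Pi(dx)/\overline{\Pi}(\epsilon)\right)\right\}$.

Here indeed, $\Pi_\epsilon(dx) := \Pi(dx)/\overline{\Pi}(\epsilon)$ is a probability distribution on $(\epsilon, 1)$. Hence, with
$u_k$, $k \ge 1$, an iid $(0, 1)$-valued uniform sequence and $P_\epsilon$ a Poisson random variable with
intensity $\overline{\Pi}(\epsilon)$, $\chi_+(\epsilon) \stackrel{d}{=} \sum_{k=1}^{P_\epsilon} \overline{\Pi}_\epsilon^{-1}(u_k)$ belongs to the class of compound Poisson
random variables. Stated differently,

(3.13) $\chi_+(\epsilon) \stackrel{d}{=} \sum_{k=1}^{P_\epsilon} \overline{\Pi}_\epsilon^{-1}\left(u_{(k), P_\epsilon}\right)$

where $u_{(1), P_\epsilon} < \dots < u_{(P_\epsilon), P_\epsilon}$ is obtained from the uniform sample $u_1, \dots, u_{P_\epsilon}$ by
ordering the constitutive terms. We note that $\chi_+(\epsilon)$ has an atom at $\chi_+(\epsilon) = 0$ with
probability $e^{-\overline{\Pi}(\epsilon)}$. This partition is also $\chi_+(\epsilon) \stackrel{d}{=} \sum_{k=1}^{P_\epsilon} \overline{\Pi}^{-1}\left(\overline{\Pi}(\epsilon)\, u_{(k), P_\epsilon}\right)$ where,
when $\epsilon \downarrow 0$, $P_\epsilon \uparrow \infty$ and $\left(\overline{\Pi}(\epsilon)\, u_{(1), P_\epsilon}, \dots, \overline{\Pi}(\epsilon)\, u_{(k), P_\epsilon}, \dots\right) \to (S_1, \dots, S_k, \dots)$, a Poisson
point process on $\mathbb{R}_+$. Thus the decomposition of $\chi_+(\epsilon)$ constitutes a weak Poisson-
partition approximation to the one of $\chi$.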

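Remarks:

(i) A slightly more general problem is to consider the random variable
$\chi_+(\epsilon_1) := \sum_{k \ge 1} \xi_{(k)} \mathbf{1}(\xi_{(k)} > \epsilon_k)$, where $(\epsilon_k, k \ge 1)$ are iid $(0, 1)$-valued random variables,
independent of $(\xi_{(k)}, k \ge 1)$. From the Campbell formula, performing an integration by
parts, with $F_{\epsilon_1}(\epsilon) = \mathbf{P}(\epsilon_1 \le \epsilon)$, we obtain

$\mathbf{E}\exp\{-\lambda\chi_+(\epsilon_1)\} = \exp\left\{-\mathbf{E}\int_{\epsilon_1}^1 \left(1 - e^{-\lambda x}\right) \Pi(dx)\right\}$

(3.14) $= \exp\left\{-\int_0^1 \left(1 - e^{-\lambda x}\right) F_{\epsilon_1}(x)\, \Pi(dx)\right\}$

showing that, in general, $\chi_+(\epsilon_1)$ is an ID random variable with Lévy measure for
jumps $F_{\epsilon_1}(x)\, \Pi(dx)$.

Note also that if $\chi_-(\epsilon_1) := \sum_{k \ge 1} \xi_{(k)} \mathbf{1}(\xi_{(k)} \le \epsilon_k)$, clearly

(3.15) $\mathbf{E}\exp\{-\lambda\chi_-(\epsilon_1)\} = \exp\left\{-\int_0^1 \left(1 - e^{-\lambda x}\right) \overline{F}_{\epsilon_1}(x)\, \Pi(dx)\right\}$

where $\overline{F}_{\epsilon_1}(x) := 1 - F_{\epsilon_1}(x)$.

(ii) Finally, in the context of random covering of $(\xi_{(k)}, k \ge 1)$ by random intervals
$(\epsilon_k, k \ge 1)$, the quantity $\chi_g := \sum_{k \ge 1} (\xi_{(k)} - \epsilon_k)_+$ is interpreted as the total gaps' length
(in the economic context, it is the excess wealth of the well-off agents). We obtain
directly

$\mathbf{E}\exp\left\{-\lambda \sum_{k \ge 1} (\xi_{(k)} - \epsilon_k)_+\right\} = \exp\left\{-\mathbf{E}\int_0^1 \left(1 - e^{-\lambda (x - \epsilon_1)_+}\right) \Pi(dx)\right\}$

(3.16) $= \exp\left\{-\int_0^1 \left(1 - e^{-\lambda z}\right) \mathbf{E}\Pi_{\epsilon_1}(dz)\right\}$

where $\Pi_{\epsilon_1}(dz)$ is the image measure of $\Pi(dx)$ by the application $x \to z = x - \epsilon_1 \in
(0, 1 - \epsilon_1)$. This is the LST of an ID random variable with Lévy measure for jumps
$\mathbf{E}\Pi_{\epsilon_1}(dz)$. □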

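Examples (Dickman):

Assuming $\Pi(dx) = \frac{\theta}{x} \mathbf{1}_{x \in (0,1)}\, dx$, we have $\Pi_{\epsilon_1}(dz) = \frac{\theta}{z + \epsilon_1} \mathbf{1}_{z \in (0, 1 - \epsilon_1)}\, dz$.

• If in addition, $\epsilon_1$ is uniformly distributed on $(0, 1)$, we find explicitly

$\mathbf{E}\Pi_{\epsilon_1}(dz) = dz \int_0^{1 - z} \frac{\theta}{z + \epsilon}\, d\epsilon = -\theta \log z\, dz$, $z \in (0, 1)$.

In the chosen example, the total gaps' length is an ID (rate $\theta$ compound Poisson)
random variable with logarithmic density for jumps $-\log z\, \mathbf{1}_{z \in (0,1)}$, and
$\mathbf{E}\chi_g/\mathbf{E}\chi = -\int_0^1 x \log x\, dx < 1$ is the average reduction factor.

• If $\epsilon_1$ is not random, with $\epsilon_1 \sim \delta_\epsilon$, the total gaps' length is a (rate $-\theta\log\epsilon$)
compound Poisson random variable with density for jumps $\frac{-1}{(z + \epsilon)\log\epsilon} \mathbf{1}_{z \in (0, 1 - \epsilon)}$. In
addition, one gets $\mathbf{E}\chi_g/\mathbf{E}\chi = 1 - \epsilon + \epsilon\log\epsilon \to 1$ ($\epsilon \downarrow 0^+$).

Incidentally, note that the lack of wealth of the poorest, which is $\sum_{k \ge 1} (\epsilon_k - \xi_{(k)})_+$,
diverges. □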



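• Sub-threshold Statistics.

Similarly, let $N_-(\epsilon) := \sum_{k \ge 1} \mathbf{1}(\xi_{(k)} \le \epsilon)$ count the random number of $\xi_{(k)}$ below
cutoff $\epsilon \in (0, 1)$; then $N_-(\epsilon) = \infty$ for all such $\epsilon$. The contribution to total mass $\chi$ of
those $\xi_{(k)}$ below $\epsilon \in (0, 1)$, which is

$\chi_-(\epsilon) := \sum_{k \ge 1} \xi_{(k)} \mathbf{1}(\xi_{(k)} \le \epsilon) = \sum_{k > N_+(\epsilon)} \xi_{(k)}$,

is such that

$\mathbf{E}\exp\{-\lambda\chi_-(\epsilon)\} = \exp\left\{-\int_0^1 \left(1 - e^{-\lambda x \mathbf{1}(x \le \epsilon)}\right) \Pi(dx)\right\}$

$= \exp\left\{-\int_0^\epsilon \left(1 - e^{-\lambda x}\right) \Pi(dx)\right\}$.

This is the LST of an ID random variable with Lévy measure $\Pi$ concentrated on
$(0, \epsilon)$, showing that $\chi_-(\epsilon)$ and $\chi_+(\epsilon)$ are independent with $\chi = \chi_-(\epsilon) + \chi_+(\epsilon)$.
Furthermore, $\mathbf{E}\chi_-(\epsilon) = \int_0^\epsilon x\, \Pi(dx) \sim \epsilon^2\pi(\epsilon) \to 0$ as $\epsilon \downarrow 0$, and the variance
$\sigma^2[\chi_-(\epsilon)] \sim \epsilon^3\pi(\epsilon) \to 0$ as $\epsilon \downarrow 0$. As a result,

* If $\sigma[\chi_-(\epsilon)]/\mathbf{E}\chi_-(\epsilon) \to 0$ as $\epsilon \downarrow 0$, one can check that the Central Limit Theorem holds:

(3.17) $\frac{\chi_-(\epsilon) - \mathbf{E}\chi_-(\epsilon)}{\sigma[\chi_-(\epsilon)]} \xrightarrow{d} \mathcal{N}(0, 1)$, $\epsilon \downarrow 0$.

* If $\epsilon\pi(\epsilon) \to a > 0$ as $\epsilon \downarrow 0$, then $\sigma[\chi_-(\epsilon)]/\mathbf{E}\chi_-(\epsilon) \nrightarrow 0$ as $\epsilon \downarrow 0$, and we easily find

(3.18) $\frac{\chi_-(\epsilon)}{\mathbf{E}\chi_-(\epsilon)} \xrightarrow{d} \chi_a$ as $\epsilon \downarrow 0$, with $\mathbf{E}e^{-\lambda\chi_a} = e^{-a\int_0^1 \left(1 - e^{-\lambda x/a}\right) dx/x}$.

The limiting random variable is a (mean 1) infinitely divisible random variable of major
interest. It will be studied in some detail in the sequel (see Subsection 4.1).

Although the atoms below the cutoff are infinitely many, their contribution to
total mass always goes to 0 as the cutoff approaches 0.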


Weighted Partitions (Modulation)

Let $(\mu_k; k \ge 1)$ be a sequence of iid non-negative random variables, independent
of $(\xi_{(k)}; k \ge 1)$. We shall investigate some properties of the weighted or modulated
random sequence $(\mu_k\xi_{(k)}; k \ge 1)$ as a new random transformed partition
of $\chi_\mu := \sum_{k \ge 1} \mu_k\xi_{(k)}$. This question appears in the following problem: assume
the events $(\xi_{(k)}; k \ge 1)$, summing up to $\chi$, are each corrupted by some multiplicative
independent noise $(\mu_k; k \ge 1)$; then the observed sequence of events becomes
$(\mu_k\xi_{(k)}; k \ge 1)$ and the observed cumulative energy turns out to be $\chi_\mu$.

Let us first consider the quantity $\mathbf{E}\exp\left\{-\lambda\sum_{k \ge 1} g(\mu_k\xi_{(k)})\right\}$. From the Campbell
formula, we have

$\mathbf{E}\exp\left\{-\lambda\sum_{k \ge 1} g(\mu_k\xi_{(k)})\right\} = \mathbf{E}\left\{\prod_{k \ge 1} \mathbf{E}\left(\exp\{-\lambda g(\mu_k\xi_{(k)})\} \mid \xi_{(k)}\right)\right\}$

$= \exp\left\{-\mathbf{E}\int_0^1 \left(1 - e^{-\lambda g(\mu_1 x)}\right) \Pi(dx)\right\}$

(3.19) $= \exp\left\{-\mathbf{E}\int_0^{\mu_1} \left(1 - e^{-\lambda g(z)}\right) \Pi_{\mu_1}(dz)\right\}$

with $\Pi_{\mu_1}(dz)$ the image measure of $\Pi(dx)$ by the application $x \to z = \mu_1 x$.
Examples (Dickman):

We note the scale-invariance property $\Pi_{\mu_1}(dz) = \Pi(dz)$ when $\Pi(dx) = \frac{\theta}{x}\, dx$.
Using an integration by parts and putting $\overline{F}_{\mu_1}(x) = \mathbf{P}(\mu_1 > x)$, this shows that, in
this particular case for $\Pi$ only,

(3.20) $\mathbf{E}\exp\left\{-\lambda\sum_{k \ge 1} g(\mu_k\xi_{(k)})\right\} = e^{-\int_0^\infty \left(1 - e^{-\lambda g(x)}\right) \overline{F}_{\mu_1}(x)\, \Pi(dx)}$.

In particular, $\sum_{k \ge 1} \mu_k\xi_{(k)}$ is a positive ID random variable with no negative jumps
whose induced Lévy measure for jumps is $\theta\frac{\overline{F}_{\mu_1}(x)}{x}\, dx$. For example:

(i) p-thinning: if $\mu_1 \sim$ Bernoulli($p$), the new Lévy measure is $\frac{\theta p}{x} \mathbf{1}_{x \in (0,1)}\, dx$.

(ii) uniform thinning: if $\mu_1 \sim$ Uniform(0, 1), the new transformed Lévy measure
is $\theta\frac{1 - x}{x} \mathbf{1}_{x \in (0,1)}\, dx$.

(iii) exponential scaling: if $\mu_1 \sim$ exp(1), the new Lévy measure is $\frac{\theta}{x} e^{-x} \mathbf{1}_{x > 0}\, dx$,
which is the one of a gamma (or Moran) subordinator. Note that
$\mu_1\xi_{(1)} \succeq_{st} \dots \succeq_{st} \mu_k\xi_{(k)} \succeq_{st} \dots$ and that, although the sequence $(\mu_k\xi_{(k)}; k \ge 1)$ is not
strongly ordered by decreasing sizes, the constitutive terms sum up to a
gamma($\theta$)-distributed random variable. This should not be confused with the other
natural partition of $\chi \sim$ gamma($\theta$) given by

$\chi = \sum_{k \ge 1} \varsigma_{(k)}$

where $\varsigma_{(k)} = \overline{\Pi}^{-1}(S_k)$, $k \ge 1$, and $\overline{\Pi}(x) = \theta\int_x^\infty \frac{1}{z}\exp\{-z\}\, dz$. This constitutes an
example where two distinct sequences $(\mu_k\xi_{(k)}; k \ge 1)$ and $(\varsigma_{(k)}; k \ge 1)$ both share
the same partition function. □
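
The exponential scaling case can be probed numerically: if the modulated jumps sum to a gamma($\theta$) variable, the total should have both mean and variance equal to $\theta$. The sketch below is illustrative and not part of the original text; it assumes the Dickman-type jumps $\exp\{-S_k/\theta\}$, truncates the series after finitely many terms, and uses hypothetical names and a fixed seed.

```python
import math
import random


def modulated_total(theta, n_terms, rng):
    """Sum of mu_k * exp(-S_k / theta) with mu_k iid exp(1) weights, independent of the
    unit-rate Poisson arrival times S_k; the series is truncated after n_terms."""
    s = 0.0
    total = 0.0
    for _ in range(n_terms):
        s += rng.expovariate(1.0)           # next arrival time S_k
        weight = rng.expovariate(1.0)       # independent exp(1) modulation mu_k
        total += weight * math.exp(-s / theta)
    return total


theta = 1.5               # illustrative value
rng = random.Random(5)    # fixed seed, illustration only
totals = [modulated_total(theta, 60, rng) for _ in range(20000)]
mean_total = sum(totals) / len(totals)
var_total = sum((t - mean_total) ** 2 for t in totals) / (len(totals) - 1)
print("mean:", mean_total, "variance:", var_total, "gamma(theta) value:", theta)
```

Matching the first two moments is of course only a necessary check, not a proof of the gamma law; a fuller comparison would look at the whole empirical distribution.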

Typical Fragment Size from $(\xi_{(k)}, k \ge 1)$

We can define the typical fragment size from $(\xi_{(k)}, k \ge 1)$ to be a $(0, 1)$-valued
random variable, say $\xi$, with density $f_\xi(x)$, whose distribution function $F_\xi(x)$ is
defined by the random mixture

$F_\xi(s) = \sum_{k \ge 1} w_k F_{\xi_{(k)}}(s)$.

Here, the weights $w_k = \frac{1}{\theta}\mathbf{E}(\xi_{(k)})$ satisfy $w_k \in (0, 1)$ and $\sum_{k \ge 1} w_k = 1$. With
$\phi_\xi(q) := \mathbf{E}\xi^q$, its moment function is equivalently given by

(3.21) $\phi_\xi(q) = \frac{1}{\theta}\sum_{k \ge 1} \mathbf{E}(\xi_{(k)})\, \phi_{\xi_{(k)}}(q)$

in terms of $\phi_{\xi_{(k)}}(q) := \mathbf{E}\xi_{(k)}^q$, the moment functions of $\xi_{(k)}$.

Size-biased Picking from $(\xi_{(k)}, k \ge 1)$

Let $\eta$ be a $(0, 1)$-valued random variable taking the value $\xi_{(k)}$ with probability
$\frac{1}{\theta}\xi_{(k)}$ given $(\xi_{(k)}, k \ge 1)$. This random variable corresponds to a size-biased picking
from $(\xi_{(k)}, k \ge 1)$. Its moment function is

(3.22) $\phi_\eta(q) = \mathbf{E}\eta^q := \frac{1}{\theta}\mathbf{E}\sum_{k \ge 1} \xi_{(k)}^{q+1} = \frac{1}{\theta}\phi(q + 1)$

for $q > q_* := \beta_* - 1 \in [-1, 0)$.

The waiting time paradox reads

(3.23) $\eta \succeq_{st} \xi$,

a stochastic domination property translating the fact that in the size-biased picking
procedure, large fragments are favored.
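
To make the comparison between typical and size-biased fragments concrete, the two moment functions can be evaluated in the Dickman case, where $\mathbf{E}\xi_{(k)}^q = (\theta/(\theta+q))^k$ and $\phi(\beta) = \int_0^1 x^{\beta}\,\theta x^{-1}\,dx = \theta/\beta$. The sketch below is an illustration derived from these two expressions, not a computation taken from the original text; the function names and the truncation of the series are assumptions.

```python
def typical_moment(theta, q, n_terms=500):
    """(1/theta) * sum_k E(xi_(k)) * E(xi_(k)^q) in the Dickman case, where
    E(xi_(k)^q) = (theta / (theta + q))**k; the series is truncated at n_terms."""
    mean_ratio = theta / (theta + 1.0)
    moment_ratio = theta / (theta + q)
    series = sum((mean_ratio * moment_ratio) ** k for k in range(1, n_terms + 1))
    return series / theta


def size_biased_moment(theta, q):
    """phi(q + 1) / theta with phi(beta) = theta / beta in the Dickman case."""
    return (theta / (q + 1.0)) / theta


theta = 2.0  # illustrative value
for q in (0.5, 1.0, 2.0):
    print(q, typical_moment(theta, q), size_biased_moment(theta, q))
```

For positive $q$ the size-biased moments come out larger than the typical ones, in line with the stochastic domination stated above.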
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3.2 Spacings and Strong Partition of Unity: Normalizing 

The partition (5(/c); k > l) of x induces another natural partition of unity de- 
fined as follows. Define the incremental random variables '■= ^(fc-i) — C(fc) (with 
^(0) := 1), A; > 1. Then, ^^fe, fc > 1^ defines a new sequence of (0, 1) —valued random 

variables with clearly ^k>i^k = 1 (almost surely). The ^^fe; A; > 1^ constitute a 
strong (almost sure) random partition of unity built on Spacings between con- 
secutive ordered energies sum up to 1, which is the top energy a single event can 
develop according to our assumptions. This model was first considered by [18] and 
reconsidered by [19] in the context of combinatorial structures. 

That this construction is possible is indeed a consequence of 11 being concentrated 
on (0, 1) leading to (0, 1) —valued C(fe)S. Note that there are no reasons, in general, 
for the ^feS to be ordered either in the strict or weaker stochastic sense. The ordered 
version of ^^fe; k> ij, say (^^(fe); > 1^ , is thus also of some interest. 

This construction should not be confused with another partition of unity which can be defined from the system of ordered normalized random weights $\varsigma_{(k)} := \xi_{(k)}/\chi$, $k\ge1$, satisfying $\sum_{k\ge1}\varsigma_{(k)} = 1$ and $\varsigma_{(1)} > .. > \varsigma_{(k)} > ..$ For this kind of partition of unity, the condition that the Levy measure $\Pi$ of $\chi$ be concentrated on $(0,1)$ is inessential. When $\Pi$ is concentrated on $(0,\infty)$, one speaks of Poisson-Kingman partitions (see [20]). For instance, when $\Pi(dx) = \theta e^{-x}/x\,dx$, $x>0$, is the Levy measure for jumps of a Moran (gamma) subordinator $(\chi_t, t\ge0)$, $(\varsigma_{(k)}; k\ge1)$ has Poisson-Dirichlet PD$(\theta)$ distribution, independent of $\chi = \chi_1$. For such problems, the joint law of $(\chi; \varsigma_{(k)}, k=1,..,l)$ for each $l\ge1$ deserves some attention. They are given by Perman formulae (see [21], for additional details).

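Strong Partition Function of Unity from $\chi$: Structural Measure

Let

$\tilde\phi(\beta) := \mathbf{E}\sum_{k\ge1}\tilde\xi_k^{\beta}$, with $\beta > \beta_* \in [0,1)$.

With $S_0 := 0$, averaging over $S_k \sim$ gamma$(k)$, $k\ge1$, $\tilde\phi(\beta)$ can be obtained in general from

(3.24) $\tilde\phi(\beta) = \sum_{k\ge1}\int_0^\infty e^{-t}\,\mathbf{E}\left[\left(\overline{\Pi}^{-1}(S_{k-1}) - \overline{\Pi}^{-1}(S_{k-1}+t)\right)^{\beta}\right]dt,$

recalling $S_k = S_{k-1} + T_k$ where $S_{k-1}$ is independent of $T_k \sim \exp(1)$. The measure $\sigma(dx)$ such that $\tilde\phi(\beta) = \int_0^1 x^{\beta}\sigma(dx)$ is called the structural measure of the partition $(\tilde\xi_k, k\ge1)$. With $\overline\sigma(x) := \int_x^1\sigma(dz)$, recalling $\mathbf{P}(S_k\in ds) = \frac{1}{\Gamma(k)}s^{k-1}e^{-s}ds$, it can generally be obtained, after a change of variable, from

(3.25) $\overline\sigma(x) = \sum_{k\ge1}\int_0^\infty e^{-t}\,\mathbf{P}\left(\overline{\Pi}^{-1}(S_{k-1}) - \overline{\Pi}^{-1}(S_{k-1}+t) > x\right)dt$

$\quad = e^{-\overline{\Pi}(1-x)} + \mathbf{E}\sum_{k\ge1} e^{-\left[\overline{\Pi}\left(\overline{\Pi}^{-1}(S_k)-x\right)-S_k\right]}\cdot I\left(S_k < \overline{\Pi}(x)\right)$

$\quad = e^{-\overline{\Pi}(1-x)} + \int_x^1 e^{-\left[\overline{\Pi}(z-x)-\overline{\Pi}(z)\right]}\pi(z)\,dz.$

The random variable $\sum_{k\ge1}\tilde\xi_k^{\beta}$ is called the partition function of unity constructed from $\chi$, and $\sigma$, defined by $\overline\sigma(x) = \mathbf{E}\sum_{k\ge1} I(\tilde\xi_k > x)$, its structural measure.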

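Cutoff Considerations for Spacings Partition

1/ Overshoot. First, we note that, if $\bar\xi_k := 1 - \sum_{l=1}^{k}\tilde\xi_l$ is the amount of space left vacant by the $k$ first atoms of $(\tilde\xi_k, k\ge1)$, we get $\bar\xi_k = \xi_{(k)}$. This remark allows us to derive the following result.

Let $x\in(0,1)$ be some threshold value. Define

$K(x) := \inf\left(k\ge1 : \sum_{l=1}^{k}\tilde\xi_l > x\right)$

to be the first time the cumulated fragments size of $(\tilde\xi_k, k\ge1)$ exceeds $x$. Then,

$K(x) \overset{d}{=} 1 + P_{\lambda(x)}$

where $P_{\lambda(x)}$ is a Poisson distributed random variable with parameter $\lambda(x) := \overline{\Pi}(1-x)$.
Indeed, $\mathbf{P}(K(x) > k) = \mathbf{P}\left(\sum_{l=1}^{k}\tilde\xi_l \le x\right) = \mathbf{P}\left(\xi_{(k)} \ge 1-x\right) = \mathbf{P}\left(S_k \le \lambda(x)\right)$, which is the probability that at least $k$ points of the unit-rate Poisson process $(S_k, k\ge1)$ fall in $[0,\lambda(x)]$.
The random quantity $\sum_{l=1}^{K(x)}\tilde\xi_l - x$ is the overshoot at $x$.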
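The first-passage statement above can be illustrated by simulation. The following is a minimal sketch, not part of the original derivation: it assumes, for illustration only, the Levy tail $\overline{\Pi}(x) = -\theta\log x$ (the Dickman case), so that $\xi_{(k)} = e^{-S_k/\theta}$ and $\lambda(x) = -\theta\log(1-x)$. The function names and the parameter values are hypothetical choices.

```python
import math
import random


def first_passage_index(theta, x, rng):
    """Return K(x), assuming xi_(k) = exp(-S_k / theta) (Dickman case)."""
    s = 0.0        # S_0 = 0
    k = 0
    covered = 0.0  # cumulated spacings, equal to 1 - xi_(k)
    while covered <= x:
        s += rng.expovariate(1.0)  # S_k = S_{k-1} + T_k with T_k ~ exp(1)
        k += 1
        covered = 1.0 - math.exp(-s / theta)
    return k


def mean_k_minus_one(theta, x, n_runs=20000, seed=1):
    """Monte Carlo estimate of E[K(x) - 1]."""
    rng = random.Random(seed)
    total = sum(first_passage_index(theta, x, rng) - 1 for _ in range(n_runs))
    return total / n_runs


theta, x = 2.0, 0.7                 # illustrative values
lam = -theta * math.log(1.0 - x)    # Poisson parameter lambda(x) in this case
print(mean_k_minus_one(theta, x), lam)
```

The two printed numbers should be close, since a Poisson variable with parameter $\lambda(x)$ has mean $\lambda(x)$; comparing the empirical variance with $\lambda(x)$ would be a natural further check.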

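2/ Let $N_+(\epsilon) := \sum_{k\ge1} I(\tilde\xi_k > \epsilon)$ be the random number of spacings exceeding $\epsilon\in(0,1)$. Then, from the above expression of $\overline\sigma(x)$ in Eq. (3.25), evaluated in a neighborhood of $x = 0$,

$\mathbf{E}N_+(\epsilon) = \overline\sigma(\epsilon) \sim_{\epsilon\downarrow0} \overline{\Pi}(\epsilon),$

and Chen-Stein methods for Poisson approximations of $N_+(\epsilon)$ could be developed, in the spirit of [22].

However, $\mathbf{P}\left(\inf(l\ge1 : \tilde\xi_l \le \epsilon) > k\right) = \mathbf{P}\left(\wedge_{l=1}^{k}\tilde\xi_l > \epsilon\right)$ where $\wedge_{l=1}^{k}\tilde\xi_l$ is the smallest term amongst $(\tilde\xi_1, .., \tilde\xi_k)$ and it is no longer true that $N_+(\epsilon) = \inf(k\ge1 : \tilde\xi_k \le \epsilon) - 1$ because the $\tilde\xi_k$ are not ordered.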

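The contribution to total mass of those $\tilde\xi_k$ above or below $\epsilon$ are respectively $1_+(\epsilon) := \sum_{k\ge1}\tilde\xi_k I(\tilde\xi_k > \epsilon)$ and $1_-(\epsilon) := \sum_{k\ge1}\tilde\xi_k I(\tilde\xi_k \le \epsilon)$. The full laws of these quantities are difficult to obtain in general. Indeed,

$\mathbf{E}\exp\{-\lambda N_+(\epsilon)\} = \mathbf{E}\prod_{k\ge1}\left(1 - (1-e^{-\lambda})\,I(\tilde\xi_k > \epsilon)\right)$

$\mathbf{E}\exp\{-\lambda 1_\pm(\epsilon)\} = \mathbf{E}\prod_{k\ge1}\left(1 - (1-e^{-\lambda\tilde\xi_k})\,I(\tilde\xi_k \gtrless \epsilon)\right)$

and the joint laws of the $\tilde\xi_k$ are required. However, as $\epsilon\downarrow0$, it still holds that

$1_+(\epsilon) \to 1$ and $\mathbf{E}1_-(\epsilon) = \mathbf{E}\sum_{k\ge1}\tilde\xi_k I(\tilde\xi_k \le \epsilon) \sim \epsilon^2\pi(\epsilon).$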

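Size-biased Picking from $(\tilde\xi_k, k\ge1)$

Let $\tilde\eta$ be a $(0,1)$-valued random variable taking the value $\tilde\xi_k$ with probability $\tilde\xi_k$ given $(\tilde\xi_k, k\ge1)$. This random variable corresponds to a size-biased picking from $(\tilde\xi_k, k\ge1)$. Its moment function is

(3.26) $\phi_{\tilde\eta}(q) = \mathbf{E}\tilde\eta^q := \mathbf{E}\sum_{k\ge1}\tilde\xi_k^{q+1} = \tilde\phi(q+1)$

for $q > q_* := \beta_* - 1 \in [-1,0)$.

Just like for the pair $\eta$ and $\xi$, the waiting time paradox reads $\tilde\eta \succeq_{st} \tilde\xi$.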

4 Examples: Dickman partition and related ones

We start with a fundamental example in many respects, for which most computa- 
tions can be painlessly achieved. We call it Dickman partition for reasons to appear 
later. The peculiarities of this model clearly appeared in the Examples developed to 
illustrate the general partition model under study in the previous Sections. 

4.1 Dickman Partition 

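• Assume $\Pi(dx) = \frac{\theta}{x}I_{x\in(0,1)}dx$. Then $\overline{\Pi}(x) = -\theta\log x$ and $\overline{\Pi}^{-1}(s) = e^{-s/\theta}$. The LST of $\chi$ in this case is

$\mathbf{E}e^{-\lambda\chi} = \exp\left\{-\theta\int_0^1\frac{1-e^{-\lambda x}}{x}dx\right\}.$

Let us now proceed with the detailed study of the multiplicative structure of this partitioning model.

Partition Function

In this example, we have $\xi_{(k)} = e^{-S_k/\theta} = \prod_{l=1}^{k}B_l$ where $B_k$ are iid with beta$(\theta,1)$ law: $\mathbf{P}(B_1 \le x) = x^{\theta}$. The $\xi_{(k)}$ are thus log-gamma$(k,\theta)$ distributed. We obtain

$\phi(\beta) = \mathbf{E}\sum_{k\ge1}\xi_{(k)}^{\beta} = \sum_{k\ge1}\left(\frac{\theta}{\theta+\beta}\right)^{k} = \frac{\theta}{\beta}$

for $\beta > \beta_* = 0$. The structural measure is $\sigma(dx) = \frac{\theta}{x}dx$.

Note also that $-\frac{\theta}{k}\log(\xi_{(k)}) \to 1$ almost surely as $k\uparrow\infty$ so that $\xi_{(k)}$ goes to 0 exponentially fast with $k$.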

Note that, exploiting the product decomposition of the $\xi_{(k)}$, if $\varsigma_{(1)} := \xi_{(1)}/\chi$ is the largest normalized fragment of the $\xi_{(k)}$'s, we get $\varsigma_{(1)} \overset{d}{=} 1/(1+\chi)$. Consequently,

(4.2) $\mathbf{E}e^{-\lambda/\varsigma_{(1)}} = \exp\left\{-\lambda - \theta\int_0^1\frac{1-e^{-\lambda x}}{x}dx\right\}$

is the infinitely divisible LST of $1/\varsigma_{(1)} > 1$.

Correlation Structure

The correlation structure of the Dickman partition has already been computed in a former example. The result is the expression of $C_l(q_1,q_2)$ in (3.5).

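Typical and Size-biased Fragment Size

The size-biased picking random variable $\eta$ from $(\xi_{(k)}; k\ge1)$ has uniform law since

(4.3) $\mathbf{E}\eta^q := \frac{1}{\theta}\mathbf{E}\sum_{k\ge1}\xi_{(k)}^{q+1} = 1/(q+1).$

The typical fragment size $\xi$ has law given by

(4.4) $\phi_\xi(q) = \mathbf{E}\xi^q := \frac{1}{\theta}\sum_{k\ge1}\mathbf{E}\xi_{(k)}\,\mathbf{E}\xi_{(k)}^{q} = \frac{\theta}{(1+\theta)q+\theta}$

corresponding to a beta$\left(\frac{\theta}{1+\theta},1\right)$ distribution. It is true that $\eta \succeq_{st} \xi$ since $F_\eta(x) = x \le F_\xi(x) = x^{\theta/(1+\theta)}$ for all $x\in[0,1]$.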

Additional Properties: the Law of $\chi$

With $\gamma$ the Euler constant, the random variable $\chi$ has a density given by

(4.5) $f_\chi(x) = e^{-\gamma\theta}x^{\theta-1}\overline{F}_\theta(x)/\Gamma(\theta)$, $x>0$

where $\overline{F}_\theta(x) := \mathbf{P}(D > x)$ is the tail probability distribution function of Dickman random variable $D$



Fe {x) = Ixe[o,i) + Ix>i 



j=i J •> ^ •' ^ iii=i-zi (=1 



with super-exponential von Mises tails $-\log\overline{F}_\theta(x) \sim_{x\uparrow\infty} x\log x$ (see [11]). When $x>1$, the function $\overline{F}_\theta(x)$ is the solution to

$x^{\theta}\overline{F}_\theta(x) = \theta\int_{x-1}^{x}z^{\theta-1}\overline{F}_\theta(z)\,dz.$

The random variable $D>1$ turns out to be the reciprocal of the largest normalized jump of the Moran (gamma) subordinator ($D = 1/\varsigma_{(1)}$).

This relationship between $f_\chi(x)$ and $\overline{F}_\theta(x)$ may be seen to be a direct consequence of the identity

$\exp\left\{-\theta\left(\int_1^\infty\frac{e^{-\lambda x}}{x}dx + \log\lambda + \gamma\right)\right\} = \exp\left\{-\theta\int_0^1\frac{1-e^{-\lambda x}}{x}dx\right\}$

involving the exponential integral function $\mathrm{Ei}(\lambda) = \int_1^\infty\frac{e^{-\lambda x}}{x}dx$ (see [11]). For these connections with Dickman's function, we shall say from (4.5) that $\chi$ follows Dickman distribution.



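Moment Function of $\chi$

The moment function of $\chi$, say $\mathbf{E}\chi^q = \int_0^\infty x^q f_\chi(x)\,dx$, is defined for $q > -\theta$, with

$\mathbf{E}\chi^q = \frac{e^{-\gamma\theta}}{\Gamma(\theta)(q+\theta)}\mathbf{E}D^{q+\theta}.$

It can be computed as follows: first, $\chi \overset{d}{=} \xi_{(1)}(1+\chi')$ where $\chi' \overset{d}{=} \chi$ is independent of $\xi_{(1)} \sim$ beta$(\theta,1)$ with $\mathbf{E}\xi_{(1)}^q = \theta/(\theta+q)$. Thus, $\chi$ is a Vervaat perpetuity of a special type.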

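Since $\mathbf{E}\xi_{(1)}^n = \theta/(\theta+n)$, using the binomial identity, the integral moments $m_n(\theta)$ of $\chi$ are first obtained recursively by $m_0(\theta) = 1$ and

$m_n(\theta) = \frac{\theta}{n}\sum_{k=0}^{n-1}\binom{n}{k}m_k(\theta), \quad n\ge1,$

with $\mathbf{E}\xi_{(1)}^n/\left(1-\mathbf{E}\xi_{(1)}^n\right) = \theta/n$. Hence

$m_1(\theta) = \theta,$

$m_2(\theta) = \frac{\theta}{2}\left(1+2m_1(\theta)\right) = \frac{\theta}{2}+\theta^2,$

$m_3(\theta) = \frac{\theta}{3}\left(1+3m_1(\theta)+3m_2(\theta)\right) = \frac{\theta}{3}+\frac{3}{2}\theta^2+\theta^3$

are the three first nested moments. Searching for solutions under the polynomial form

$m_n(\theta) = \sum_{k=1}^{n}b_{k,n}\theta^k, \quad n\ge1$

we can identify the coefficients $b_{k,n}$, $k = 2,..,n$ as the ones solving ($b_{1,n} = 1/n$) the Bell numbers-like triangular recurrence

$b_{k,n} = \frac{1}{n}\sum_{p=k-1}^{n-1}\binom{n}{p}b_{k-1,p}, \quad k = 2,..,n.$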
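The two recursions above lend themselves to an exact check. The following is a minimal sketch in rational arithmetic: it computes the moments from $m_n(\theta) = \frac{\theta}{n}\sum_{k<n}\binom{n}{k}m_k(\theta)$ and the triangle of coefficients from $b_{1,n} = 1/n$, $b_{k,n} = \frac{1}{n}\sum_{p=k-1}^{n-1}\binom{n}{p}b_{k-1,p}$, then compares the two. The function names and the test value of $\theta$ are hypothetical choices.

```python
from fractions import Fraction
from math import comb


def dickman_moments(theta, n_max):
    """m_0..m_n_max from m_n = (theta/n) * sum_{k<n} C(n,k) * m_k."""
    m = [Fraction(1)]
    for n in range(1, n_max + 1):
        m.append(Fraction(theta) / n * sum(comb(n, k) * m[k] for k in range(n)))
    return m


def b_triangle(n_max):
    """b[n][k]: coefficient of theta**k in m_n(theta), for 1 <= k <= n."""
    b = {n: {1: Fraction(1, n)} for n in range(1, n_max + 1)}
    for n in range(2, n_max + 1):
        for k in range(2, n + 1):
            # rows p < n are already complete when row n is filled
            b[n][k] = Fraction(1, n) * sum(
                comb(n, p) * b[p][k - 1] for p in range(k - 1, n))
    return b


theta = Fraction(3, 2)  # illustrative value
m = dickman_moments(theta, 5)
b = b_triangle(5)
for n in range(1, 6):
    poly = sum(b[n][k] * theta ** k for k in range(1, n + 1))
    print(n, m[n], poly)  # the two columns agree
```

Exact fractions are used so that agreement between the recursion and the polynomial form is an identity rather than a floating-point coincidence.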
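Next, with $(q)_n := q(q-1)..(q-n+1)$, the full expression of $\mathbf{E}\chi^q$ is

(4.7) $\mathbf{E}\chi^q = \frac{\theta}{q+\theta}\left(1+\sum_{n\ge1}\frac{(q)_n}{n!}m_n(\theta)\right)$, $q > -\theta$,

translating the identity $\mathbf{E}\chi^q = \frac{\theta}{\theta+q}\mathbf{E}(1+\chi)^q$. Incidentally,

$\mathbf{E}D^q = \Gamma(\theta+1)e^{\gamma\theta}\left(1+\sum_{n\ge1}\frac{(q-\theta)_n}{n!}m_n(\theta)\right)$, $q>0$

is the moment function of $D = 1/\varsigma_{(1)}$.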

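Let $\chi_{1/\beta} := \sum_{k\ge1}\xi_{(k)}^{\beta}$ with $\beta>0$. From Campbell formula,

(4.8) $\mathbf{E}e^{-\lambda\chi_{1/\beta}} = \exp\left\{-\int_0^1\left(1-e^{-\lambda x^{\beta}}\right)\Pi(dx)\right\} = \left(\mathbf{E}e^{-\lambda\chi}\right)^{1/\beta}.$

The moment function $\mathbf{E}\chi_{1/\beta}^q$ can thus readily be obtained from the one of $\mathbf{E}\chi^q$ while operating the substitution $\theta\to\theta/\beta$ in the obtained formula. Hence

(4.9) $\mathbf{E}\chi_{1/\beta}^q = \frac{\theta}{\beta q+\theta}\left(1+\sum_{n\ge1}\frac{(q)_n}{n!}m_n(\theta/\beta)\right)$, $q > -\theta/\beta$

is the moment function of the partition function $\sum_{k\ge1}\xi_{(k)}^{\beta}$. Note that $\mathbf{E}\chi_{1/\beta} = \frac{\theta}{\beta+\theta}\left(1+\theta/\beta\right) = \theta/\beta$, as required. This constitutes a complementary information to the one encoded in the LST of $\sum_{k\ge1}\xi_{(k)}^{\beta}$. This suggests the following additional construction.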


Renyi's Weighted Averages

Let $\varsigma_{(k)} := \xi_{(k)}/\chi$, $k\ge1$ be a system of normalized random weights. With $\beta > -1$, define the random Renyi $\beta$-average $[\xi]_\beta$ (with random weights $\varsigma_{(k)}$) of the $(\xi_{(k)}, k\ge1)$ to be

$[\xi]_\beta := \left(\sum_{k\ge1}\varsigma_{(k)}\xi_{(k)}^{\beta}\right)^{1/\beta}.$

This random variable is $(0,1)$-valued when $\beta > -1$ and when $\beta \le -1$, it degenerates to 0. The 2-average $[\xi]_2$ is often considered, but $[\xi]_0 := \lim_{\beta\to0}[\xi]_\beta = \prod_{k\ge1}\xi_{(k)}^{\varsigma_{(k)}}$ is also sometimes of interest. The function $\beta\to[\xi]_\beta$ is non-decreasing with $\beta$ and $[\xi]_{\beta_1} \ge [\xi]_{\beta_2}$ if $-1<\beta_2<\beta_1$.

Note that $\mathbf{E}[\xi]_\beta^{q}$ is also $\mathbf{E}\left[e^{-qH_\beta}\right]$ where $H_\beta = -\log[\xi]_\beta$ is the random Renyi $\beta$-entropy of the sequence $(\xi_{(1)}, ..)$, with, in particular, $H_0 = -\log[\xi]_0 = -\sum_{k\ge1}\varsigma_{(k)}\log\xi_{(k)}$, related to Shannon entropy.

The computation of its moment function turns out to be difficult as it stands. Indeed, recalling that $(\chi_t, t\ge0)$ is a process with stationary independent increments, the joint moment function of $\chi_{1/(\beta+1)}$ and $\chi = \chi_1$ would first be required.

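We shall rather consider the simpler Renyi $\beta$-average (with deterministic weights $w_k$) of the $(\xi_{(k)}, k\ge1)$ to be

(4.10) $\langle\xi\rangle_\beta := \left(\sum_{k\ge1}w_k\xi_{(k)}^{\beta}\right)^{1/\beta}$ where $w_k = \frac{1}{\theta}\mathbf{E}\xi_{(k)}$, $k\ge1$.

It may be checked that the range of the random variable $\langle\xi\rangle_\beta$ is $[0,1]$ when $\beta\in(0,\infty)$, which we shall limit ourselves to. Define the weighted sum

$W := \sum_{k\ge1}w_k\xi_{(k)}^{\beta}.$

Recalling $w_k = \frac{1}{\theta}\left(\frac{\theta}{\theta+1}\right)^{k}$ and $\xi_{(k)} = \prod_{l=1}^{k}B_l$ where $B_k$ are iid with $\xi_{(1)} = B_1 \sim$ beta$(\theta,1)$ law, we have

$W \overset{d}{=} \frac{\xi_{(1)}^{\beta}}{\theta+1}\left(1+\theta W'\right)$

where $W' \overset{d}{=} W$ is independent of $\xi_{(1)}^{\beta} \sim$ beta$(\theta/\beta,1)$. Since $\frac{1}{(\theta+1)^n}\mathbf{E}\xi_{(1)}^{n\beta} = \frac{1}{(\theta+1)^n}\frac{\theta}{\theta+n\beta}$, using the binomial identity, the integral moments $m_n(\theta,\beta)$ of $W$ are first obtained recursively by $m_0(\theta,\beta) = 1$ and

$m_n(\theta,\beta) = \frac{\theta}{(\theta+1)^n(\theta+n\beta)-\theta^{n+1}}\sum_{k=0}^{n-1}\binom{n}{k}\theta^{k}m_k(\theta,\beta), \quad n\ge1.$

From this

(4.11) $\mathbf{E}W^q = \frac{\theta\left(1+\sum_{n\ge1}\frac{(q)_n}{n!}\theta^{n}m_n(\theta,\beta)\right)}{(1+\theta)^{q}(\theta+\beta q)}$

and

(4.12) $\mathbf{E}\langle\xi\rangle_\beta^{q} = \mathbf{E}W^{q/\beta} = \frac{\theta\left(1+\sum_{n\ge1}\frac{(q/\beta)_n}{n!}\theta^{n}m_n(\theta,\beta)\right)}{(1+\theta)^{q/\beta}(\theta+q)}.$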



4.2 Spacings: an Alternative Construction of Poisson-Dirichlet Partition

Defining spacings to be $\tilde\xi_k := \xi_{(k-1)} - \xi_{(k)}$ (with $\xi_{(0)} := 1$), we clearly have

(4.13) $\tilde\xi_k = \prod_{l=1}^{k-1}\bar v_l\cdot v_k$, $k\ge1$

where $v_k$ are iid with beta$(1,\theta)$ law, $\mathbf{P}(v_1 > x) = (1-x)^{\theta}$, and $\bar v_k := 1-v_k$. Thus $(\tilde\xi_k, k\ge1)$ has GEM$(\theta)$ distribution with $\sum_{k\ge1}\tilde\xi_k = 1$.

In this particular example, one can also check that $\tilde\xi_1 \succeq_{st} .. \succeq_{st} \tilde\xi_k \succeq_{st} ..$ and the $(\tilde\xi_k, k\ge1)$ are arranged in stochastic descending order. Finally, as is well-known, the ordered version $(\tilde\xi_{(k)}, k\ge1)$ of $(\tilde\xi_k, k\ge1)$ has Poisson-Dirichlet PD$(\theta)$-distribution and $(\tilde\xi_k, k\ge1)$, as a size-biased permutation from PD$(\theta)$, is invariant under size-biased permutation.
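The product representation (4.13) is a stick-breaking (residual allocation) scheme, which can be sampled directly. The following is a minimal sketch: `gem_sample` is a hypothetical helper name, and truncating the sequence at `n_terms` is an assumption made for illustration, since the actual partition has infinitely many terms.

```python
import random


def gem_sample(theta, n_terms, rng):
    """First n_terms spacings of a GEM(theta) partition by stick breaking."""
    remaining = 1.0  # stick left before step k, i.e. the product of (1 - v_l), l < k
    spacings = []
    for _ in range(n_terms):
        v = rng.betavariate(1.0, theta)  # v_k ~ beta(1, theta)
        spacings.append(remaining * v)
        remaining *= 1.0 - v
    return spacings


rng = random.Random(7)
parts = gem_sample(2.0, 200, rng)
print(sum(parts))  # very close to 1: the truncation error is the leftover stick

# E[first spacing] = E[v_1] = 1 / (1 + theta)
first = [gem_sample(2.0, 1, rng)[0] for _ in range(20000)]
print(sum(first) / len(first), 1.0 / 3.0)
```

Sorting a sampled sequence in decreasing order gives an approximate draw from PD$(\theta)$, up to the truncation.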

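Correlation Structure

With $w_k := \mathbf{E}\tilde\xi_k$, the second order quantity to consider here is

(4.14) $C_l(q_1,q_2) := \mathbf{E}\left(\sum_{k\ge1}w_k\tilde\xi_k^{q_1}\tilde\xi_{k+l}^{q_2}\right).$

From the multiplicative RAM structure of the GEM$(\theta)$ partition,

$C_l(q_1,q_2) = \sum_{k\ge1}w_k\,\mathbf{E}\left(\prod_{j=1}^{k-1}\bar v_j^{q_1+q_2}\cdot v_k^{q_1}\bar v_k^{q_2}\cdot\prod_{j=k+1}^{k+l-1}\bar v_j^{q_2}\cdot v_{k+l}^{q_2}\right)$

$\quad = \sum_{k\ge1}\frac{1}{\theta+1}\left(\frac{\theta}{\theta+1}\right)^{k-1}\left(\frac{\theta}{\theta+q_1+q_2}\right)^{k-1}\times\frac{\theta\,\Gamma(1+q_1)\Gamma(\theta+q_2)}{\Gamma(1+\theta+q_1+q_2)}\times\left(\frac{\theta}{\theta+q_2}\right)^{l-1}\frac{\theta\,\Gamma(1+q_2)\Gamma(\theta)}{\Gamma(1+\theta+q_2)}$

(4.15) $\quad = K(q_1,q_2)\left(\frac{\theta}{\theta+q_2}\right)^{l}$

where

(4.16) $K(q_1,q_2) = \frac{\Gamma(1+q_1)\Gamma(1+q_2)\Gamma(1+\theta)}{\Gamma(\theta+q_1+q_2)\left[\theta+(\theta+1)(q_1+q_2)\right]}$

is defined for $\{q_1+q_2 > -\theta/(\theta+1);\ (q_1,q_2) > -1\}$. The definition domain of $C_l(q_1,q_2)$ therefore is $\{q_1+q_2 > -\theta/(\theta+1);\ q_1 > -1;\ q_2 > -\min(1,\theta)\}$. This should be compared with the expression of $C_l(q_1,q_2)$ in (3.5), dealing with the unnormalized case.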
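As a sanity check on (4.15) and (4.16), the sum over $k$ can be evaluated term by term and compared with the closed form. The following is a minimal sketch, assuming the standard beta moments $\mathbf{E}\bar v^{q} = \theta/(\theta+q)$ and $\mathbf{E}\,v^{q_1}\bar v^{q_2} = \theta\,\Gamma(1+q_1)\Gamma(\theta+q_2)/\Gamma(1+\theta+q_1+q_2)$ for $v\sim$ beta$(1,\theta)$. Function names and numerical values are hypothetical.

```python
import math


def c_series(l, q1, q2, theta, n_terms=2000):
    """C_l(q1, q2) for GEM(theta), summing the series over k directly."""
    def e_vbar(q):  # E (1 - v)^q for v ~ beta(1, theta)
        return theta / (theta + q)

    # E v^q1 (1 - v)^q2
    e_mixed = (theta * math.gamma(1 + q1) * math.gamma(theta + q2)
               / math.gamma(1 + theta + q1 + q2))
    # E v^q2
    e_v = math.gamma(1 + q2) * math.gamma(1 + theta) / math.gamma(1 + theta + q2)
    total = 0.0
    for k in range(1, n_terms + 1):
        w_k = (theta / (theta + 1)) ** (k - 1) / (theta + 1)  # w_k = E xi~_k
        total += w_k * e_vbar(q1 + q2) ** (k - 1)
    return total * e_mixed * e_vbar(q2) ** (l - 1) * e_v


def c_closed(l, q1, q2, theta):
    """K(q1, q2) * (theta / (theta + q2))**l, as in (4.15)-(4.16)."""
    k_fac = (math.gamma(1 + q1) * math.gamma(1 + q2) * math.gamma(1 + theta)
             / (math.gamma(theta + q1 + q2) * (theta + (theta + 1) * (q1 + q2))))
    return k_fac * (theta / (theta + q2)) ** l


print(c_series(3, 0.5, 1.5, 2.0), c_closed(3, 0.5, 1.5, 2.0))
```

At $q_1 = q_2 = 0$ the closed form reduces to $\sum_{k\ge1}w_k = 1$, which gives a second quick consistency check.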



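Cutoff Considerations for GEM$(\theta)$ Partition

Let $N_+(\epsilon) := \sum_{k\ge1}I(\tilde\xi_k > \epsilon)$ be the random number of spacings $\tilde\xi_k$ exceeding $\epsilon\in(0,1)$. Then, using the RAM structure of $(\tilde\xi_k; k\ge1)$, we have

$N_+(\epsilon) \overset{d}{=} I(v_1 > \epsilon) + N_+'(\epsilon/\bar v_1)\,I(\bar v_1 > \epsilon)$

where $N_+'(.)$ is a statistical copy of $N_+(.)$. Here $v_1 \sim$ beta$(1,\theta)$ and $\bar v_1 := 1-v_1$ is independent of $N_+'(.)$. In particular, if $n_+^{(p)}(\epsilon) = \mathbf{E}N_+^{p}(\epsilon)$ is the order-$p$ moment of $N_+(\epsilon)$, we have the following recurrence

$n_+^{(1)}(\epsilon) = (1-\epsilon)^{\theta} + \theta\int_\epsilon^1 v^{\theta-1}n_+^{(1)}(\epsilon/v)\,dv$ and, with $p\ge2$:

$n_+^{(p)}(\epsilon) = (1-\epsilon)^{\theta} + \theta\sum_{k=1}^{p-1}\binom{p}{k}\int_\epsilon^{1-\epsilon}v^{\theta-1}n_+^{(k)}(\epsilon/v)\,dv + \theta\int_\epsilon^1 v^{\theta-1}n_+^{(p)}(\epsilon/v)\,dv$

(for $\epsilon<1/2$; the middle sum vanishes otherwise). Let

$\tilde n_+^{(1)}(q) := \int_0^1\epsilon^{q}n_+^{(1)}(\epsilon)\,d\epsilon.$

From the first recurrence on $n_+^{(1)}(\epsilon)$, recalling

$\int_0^1\epsilon^{q}(1-\epsilon)^{\theta}d\epsilon = \frac{\Gamma(q+1)\Gamma(\theta+1)}{\Gamma(q+\theta+2)}$

and using an integration by part, we obtain explicitly

$\tilde n_+^{(1)}(q) = \frac{\Gamma(q+1)\Gamma(\theta+1)}{\Gamma(q+\theta+1)(q+1)}$

leading to $n_+^{(1)}(\epsilon) = \theta\int_\epsilon^1 z^{-1}(1-z)^{\theta-1}dz$. The function $\tilde n_+^{(1)}(q)$ has a double pole at $q=-1$ and $\tilde n_+^{(1)}(q) \sim_{q=-1} \theta(q+1)^{-2}$, showing, as required, from singularity analysis, that $n_+^{(1)}(\epsilon) \sim_{\epsilon\downarrow0} -\theta\log\epsilon$.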
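The first-moment recurrence can be checked numerically against the explicit solution $n_+^{(1)}(\epsilon) = \theta\int_\epsilon^1 z^{-1}(1-z)^{\theta-1}dz$. The following is a minimal sketch using a plain midpoint rule; the helper names are hypothetical, and $\theta\ge1$ is assumed so that the integrand stays bounded near $z=1$. For $\theta=2$ the integral evaluates to $2(\epsilon-1-\log\epsilon)$, which serves as a reference value.

```python
import math


def midpoint(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def n_plus(eps, theta):
    """theta * int_eps^1 z^-1 (1 - z)^(theta - 1) dz, for theta >= 1."""
    return theta * midpoint(lambda z: (1.0 - z) ** (theta - 1.0) / z, eps, 1.0)


def recurrence_rhs(eps, theta):
    """(1 - eps)^theta + theta * int_eps^1 v^(theta - 1) n_plus(eps / v) dv."""
    inner = lambda v: v ** (theta - 1.0) * n_plus(eps / v, theta)
    return (1.0 - eps) ** theta + theta * midpoint(inner, eps, 1.0)


eps, theta = 0.2, 2.0  # illustrative values
closed = 2.0 * (eps - 1.0 - math.log(eps))  # exact value of n_plus for theta = 2
print(n_plus(eps, theta), recurrence_rhs(eps, theta), closed)
```

The three printed values agree up to quadrature error, which is the fixed-point property expressed by the recurrence.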

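Similarly, considering the contribution to total mass of those $\tilde\xi_k$ above or below $\epsilon$, which are respectively $1_+(\epsilon) := \sum_{k\ge1}\tilde\xi_k I(\tilde\xi_k > \epsilon)$ and $1_-(\epsilon) := \sum_{k\ge1}\tilde\xi_k I(\tilde\xi_k \le \epsilon)$, it holds that

$1_+(\epsilon) \overset{d}{=} v_1 I(v_1 > \epsilon) + \bar v_1 1_+'(\epsilon/\bar v_1)\,I(\bar v_1 > \epsilon)$

$1_-(\epsilon) \overset{d}{=} v_1 I(v_1 \le \epsilon) + \bar v_1 1_-'(\epsilon/\bar v_1)\,I(\bar v_1 > \epsilon)$

where $1_\pm'(.)$ are statistical copies of $1_\pm(.)$ respectively, independent of $v_1$. Let $x_\pm(\epsilon) := \mathbf{E}[1_\pm(\epsilon)]$ stand for the mean values. Then, with

$a_+(\epsilon) := \theta\int_\epsilon^1 v(1-v)^{\theta-1}dv$ and $a_-(\epsilon) := \theta\int_0^\epsilon v(1-v)^{\theta-1}dv,$

we have

$x_\pm(\epsilon) = a_\pm(\epsilon) + \theta\int_\epsilon^1 v^{\theta}x_\pm(\epsilon/v)\,dv.$

Defining $\tilde x_\pm(q) := \int_0^1\epsilon^{q}x_\pm(\epsilon)\,d\epsilon$, similar computations show that

1/

(4.18) $\tilde x_+(q) = \frac{\Gamma(q+1)\Gamma(\theta+1)}{\Gamma(q+\theta+2)}$

and $\tilde x_+(q)$ has a simple dominant pole at $q=-1$ with $\tilde x_+(q) \sim_{q=-1} (q+1)^{-1}$, showing, as required, from singularity analysis, that $x_+(\epsilon) \sim_{\epsilon\downarrow0} 1$. More precisely, inverting (4.18), $x_+(\epsilon) = (1-\epsilon)^{\theta}$ and $x_+(\epsilon) \sim_{\epsilon\downarrow0} 1-\theta\epsilon$.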
2/

(4.19) $\tilde x_-(q) = \frac{\Gamma(\theta+1)(\theta+q+2)}{(q+1)(q+2)}\left(\frac{\Gamma(q+\theta+3)-\Gamma(\theta+2)\Gamma(q+3)}{\Gamma(\theta+2)\Gamma(q+\theta+3)}\right)$

and $\tilde x_-(q)$ has a simple dominant pole at $q=-2$ with $\tilde x_-(q) \sim_{q=-2} \frac{\theta^2}{\theta+1}(q+2)^{-1}$, showing, as required, from singularity analysis, that $x_-(\epsilon) \sim_{\epsilon\downarrow0} \frac{\theta^2}{\theta+1}\epsilon$.

Note that the average values $x_\pm(\epsilon)$ are known explicitly through inverting $\tilde x_\pm(q)$, even if $\epsilon$ is not small (in particular $x_+(\epsilon) = (1-\epsilon)^{\theta}$ and $x_-(\epsilon) = 1-\frac{\theta}{\theta+1}\epsilon-(1-\epsilon)^{\theta}$).

Singularity analysis of $\tilde x_\pm(q)$ gives the expected small-$\epsilon$ behavior of $x_\pm(\epsilon)$. Similar recurrence on higher-order moments of $1_\pm(\epsilon)$ could be obtained.



Weighted partition: modulation

Assume $(\tilde\xi_k; k\ge1) \sim$ GEM$(\theta)$. Let $(\mu_k; k\ge1)$ be a sequence of iid non-negative random variables. We shall investigate some properties of the weighted or modulated random sequence $(\mu_k\tilde\xi_k; k\ge1)$ as a new random transformed partition of $\chi_\mu := \sum_{k\ge1}\mu_k\tilde\xi_k$. Plainly, we have

$\chi_\mu \overset{d}{=} \mu_1v_1 + \bar v_1\chi_\mu'$

where $\chi_\mu' \overset{d}{=} \chi_\mu$ is independent of $\bar v_1 \sim$ beta$(\theta,1)$. Assume $c_k := \mathbf{E}[\mu_1^{k}] < \infty$ for all $k\ge0$. Then, with $m_n := \mathbf{E}[\chi_\mu^{n}]$, we have

$m_n = \sum_{k=0}^{n-1}\binom{n}{k}c_{n-k}\mathbf{E}\left[v_1^{n-k}\bar v_1^{k}\right]m_k + \mathbf{E}\left[\bar v_1^{n}\right]m_n.$

Recalling $\mathbf{E}\left[v_1^{n-k}\bar v_1^{k}\right] = \theta\left[(n-k)!\,\Gamma(\theta+k)\right]/\Gamma(\theta+n+1)$ and $\mathbf{E}\left[\bar v_1^{n}\right] = \theta/(\theta+n)$, setting $h_n := m_n\Gamma(\theta+n)/n!$, we obtain the convolution-like recurrence ($n\ge1$)

$nh_n = \theta\sum_{k=0}^{n-1}c_{n-k}h_k.$

If

$h(u) := \sum_{n\ge0}h_nu^{n}$

is the generating function of $(h_n; n\ge0)$, with $c(u) := \sum_{n\ge0}c_nu^{n}$, we obtain $uh'(u)+\theta h(u) = \theta c(u)h(u)$, leading explicitly to

(4.20) $h(u) = \Gamma(\theta)\exp\left\{-\theta\int_0^{u}\frac{1-c(w)}{w}dw\right\}.$

In more details, with $\tilde c_k := (k-1)!\,c_k$, $k\ge1$, and $n\ge1$, we finally get

$m_n = \frac{\Gamma(\theta)}{\Gamma(\theta+n)}\sum_{k=1}^{n}\theta^{k}B_{n,k}(\tilde c_1,\tilde c_2,...,\tilde c_{n-k+1})$

in terms of Bell polynomials.

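The moments of the partition function

$\sum_{k\ge1}\left(\mu_k\tilde\xi_k\right)^{\beta}$

of $(\mu_k\tilde\xi_k; k\ge1)$ can be obtained similarly, observing that, with $\bar v_1^{\beta} \sim$ beta$(\theta/\beta,1)$, this sum has the same law as $\mu_1^{\beta}v_1^{\beta}$ plus $\bar v_1^{\beta}$ times an independent statistical copy of itself.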

Example:

When $(\mu_k; k\ge1)$ is a sequence of iid non-negative random variables drawn from Bernoulli$(p)$ distribution, with $c_k = p$, $k\ge1$, one can check that

$m_n = \frac{\Gamma(p\theta+n)\Gamma(\theta)}{\Gamma(p\theta)\Gamma(\theta+n)}$

solves the recurrence. This shows that $\chi_\mu := \sum_{k\ge1}\mu_k\tilde\xi_k$ has beta$(p\theta,(1-p)\theta)$ distribution in this case (see [23]). □
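The Bernoulli example can be verified numerically from the convolution-like recurrence $nh_n = \theta\sum_{k<n}c_{n-k}h_k$ with $h_0 = \Gamma(\theta)$ and $m_n = h_n\,n!/\Gamma(\theta+n)$. The following is a minimal sketch; the function names, and the choice of passing the moment sequence $c_k$ as a callable, are assumptions made for illustration.

```python
import math


def modulated_moments(theta, c, n_max):
    """m_0..m_n_max of chi_mu from n * h_n = theta * sum_{k<n} c(n-k) * h_k."""
    h = [math.gamma(theta)]  # h_0 = m_0 * Gamma(theta) with m_0 = 1
    for n in range(1, n_max + 1):
        h.append(theta / n * sum(c(n - k) * h[k] for k in range(n)))
    return [h[n] * math.factorial(n) / math.gamma(theta + n)
            for n in range(n_max + 1)]


def beta_moment(a, b, n):
    """n-th moment of a beta(a, b) random variable."""
    return (math.gamma(a + n) * math.gamma(a + b)
            / (math.gamma(a) * math.gamma(a + b + n)))


theta, p = 2.5, 0.3  # illustrative values
m = modulated_moments(theta, lambda k: p, 6)  # Bernoulli(p): c_k = p for k >= 1
for n in range(7):
    print(n, m[n], beta_moment(p * theta, (1 - p) * theta, n))
```

Other modulating laws can be tried by changing the callable, for instance `lambda k: 1.0 / (k + 1)` for uniform $\mu_k$ on $(0,1)$.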



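Structural Measure

With $\beta > \beta_* = 0$, we also have in this case

$\tilde\phi(\beta) = \mathbf{E}\sum_{k\ge1}\tilde\xi_k^{\beta} = \frac{\Gamma(\beta)\Gamma(\theta+1)}{\Gamma(\beta+\theta)}.$

The measure $\sigma(dx) := \frac{\theta}{x}(1-x)^{\theta-1}I_{x\in(0,1)}dx$ is the structural measure of the GEM$(\theta)$ partition.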

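Typical Fragment Size $\tilde\xi$ from $(\tilde\xi_k, k\ge1)$

Its moment function is given by

$\phi_{\tilde\xi}(q) = \sum_{k\ge1}w_k\phi_{\tilde\xi_k}(q)$

where $\phi_{\tilde\xi_k}(q) := \mathbf{E}\tilde\xi_k^{q} = \frac{\Gamma(q+1)\Gamma(\theta+1)}{\Gamma(\theta+q+1)}\left(\frac{\theta}{\theta+q}\right)^{k-1}$ is the moment function of $\tilde\xi_k$. As a result

(4.22) $\phi_{\tilde\xi}(q) = \frac{\Gamma(q+1)\Gamma(\theta+1)}{\Gamma(\theta+q+1)(\theta+1)}\sum_{k\ge1}\left(\frac{\theta^2}{(\theta+1)(\theta+q)}\right)^{k-1} = \frac{\Gamma(q+1)\Gamma(\theta+1)}{\Gamma(\theta+q+1)}\cdot\frac{\theta+q}{\theta+(\theta+1)q}$

showing that $\tilde\xi \overset{d}{=} \tilde\eta\cdot R$ where $\tilde\eta \sim$ beta$(1,\theta)$ is independent of the $[0,1]$-valued random variable $R$, with moment function $\mathbf{E}R^{q} = \frac{\theta+q}{\theta+(\theta+1)q}$. This interprets as follows: let $B$ be a Bernoulli random variable with parameter $\frac{1}{\theta+1}$ and $C \sim$ beta$\left(\frac{\theta}{\theta+1},1\right)$ a random variable on $[0,1]$, independent of $B$. Then, $R$ is a $[0,1]$-valued random variable satisfying

$R \overset{d}{=} B+(1-B)\cdot C.$

Indeed,

$\mathbf{E}R^{q} = \frac{1}{\theta+1}+\frac{\theta}{\theta+1}\cdot\frac{\theta/(\theta+1)}{\theta/(\theta+1)+q} = \frac{\theta+q}{\theta+(\theta+1)q}.$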



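Size-biased Picking $\tilde\eta$ from $(\tilde\xi_k, k\ge1)$

Its moment function is

(4.23) $\mathbf{E}\tilde\eta^{q} = \mathbf{E}\sum_{k\ge1}\tilde\xi_k^{q+1} = \tilde\phi(q+1) = \frac{\Gamma(q+1)\Gamma(\theta+1)}{\Gamma(q+\theta+1)}$

which is the moment function of a beta$(1,\theta)$ distributed random variable (as required from the size-biased picking invariance property of $(\tilde\xi_k, k\ge1)$).
The waiting time paradox

$\tilde\eta \succeq_{st} \tilde\xi$

is clearly satisfied from the decomposition $\tilde\xi \overset{d}{=} \tilde\eta\cdot R$.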

In the next two examples, computations on spacings are quite involved. We skip 
them, focusing on the simplest aspects. 

4.3 Two Additional Examples 

We briefly sketch some properties of related partition models. 

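• Let $\alpha\in(0,1)$ and put $\bar\alpha := 1-\alpha$. Assume $\Pi(dx) = \theta\bar\alpha x^{-(1+\alpha)}I_{x\in(0,1)}dx$. Then, with $a = \frac{\theta\bar\alpha}{\alpha} > 0$,

$\overline{\Pi}(x) = a\left(x^{-\alpha}-1\right)$ and $\overline{\Pi}^{-1}(s) = (1+s/a)^{-1/\alpha}.$

We have $\xi_{(k)} = (1+S_k/a)^{-1/\alpha}$ and

$\phi(\beta) = \mathbf{E}\sum_{k\ge1}\xi_{(k)}^{\beta} = \int_0^1x^{\beta}\frac{\theta\bar\alpha}{x^{1+\alpha}}dx = \frac{\theta\bar\alpha}{\beta-\alpha}$

for $\beta > \alpha = \beta_* > 0$. Note that $\frac{a}{k}\left(\xi_{(k)}^{-\alpha}-1\right) \to 1$ almost surely as $k\uparrow\infty$ and $\xi_{(k)} \sim (k/a)^{-1/\alpha}$ goes to 0 algebraically fast with $k$ (like $k^{-1/\alpha}$) in this case. Spacings are more involved.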
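• Let $\alpha\in(0,1)$, put $\bar\alpha := 1-\alpha$ and assume

$\Pi(dx) = \frac{\theta}{\Gamma(\alpha)\Gamma(\bar\alpha)}x^{-(1+\alpha)}(1-x)^{\alpha-1}I_{x\in(0,1)}dx$

with $\int_0^1x\Pi(dx) = \theta$. We have

$\phi(\beta) = \mathbf{E}\sum_{k\ge1}\xi_{(k)}^{\beta} = \frac{\theta\,\Gamma(\beta-\alpha)}{\Gamma(\bar\alpha)\Gamma(\beta)}$

for $\beta > \alpha = \beta_* > 0$.

When $\alpha = 1/2$, $\overline{\Pi}(x) = \frac{2\theta}{\pi}\left(\frac{1-x}{x}\right)^{1/2}$ and $\overline{\Pi}^{-1}(s) = \left(1+\left(\frac{\pi s}{2\theta}\right)^{2}\right)^{-1}$. We have $\xi_{(k)} = \left(1+\left(\frac{\pi S_k}{2\theta}\right)^{2}\right)^{-1}$ and $\frac{2\theta}{\pi k}\left(\xi_{(k)}^{-1}-1\right)^{1/2} \to 1$ almost surely as $k\uparrow\infty$. As a result, $\xi_{(k)} \sim \left(\frac{2\theta}{\pi k}\right)^{2}$ goes slowly to 0 (like $k^{-2}$) in this case. The associated size-biased picking random variable $\eta$ follows the Arcsine$(1/2)$ law.

In both examples, an $\alpha$ larger than 1 would violate the condition that $\Pi$ is a Levy measure.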

5 The Bounded Partition Model with a Poissonian Number of Fragments

We finally briefly show that the partitioning model based on Levy measure concentrated on $(0,1)$ is also of some statistical relevance in the case of a bounded Levy measure for jumps. This model does not seem to have received attention in the literature.

In the bounded case, let $\mu := \overline{\Pi}(0) < \infty$. In this case

(5.1) $\mathbf{E}e^{-\lambda\chi} = \exp\left\{-\mu\left(1-\int_0^1e^{-\lambda x}F(dx)\right)\right\}$

where $F(dx) = \Pi(dx)/\mu$ is a probability distribution with mean value $\theta/\mu < 1$.

5.1 Poisson Partition of $\chi$: the Model

Hence, with $u_k$, $k\ge1$ an iid $(0,1)$-valued uniform sequence and $P_\mu$ a Poisson random variable with intensity $\mu$, $\chi = \sum_{k=1}^{P_\mu}\overline{F}^{-1}(u_k)$ belongs to the class of compound Poisson random variables. Stated differently

(5.2) $\chi = \sum_{k=1}^{P_\mu}\overline{F}^{-1}\left(u_{(k),P_\mu}\right)$

where $u_{(1),P_\mu} < .. < u_{(P_\mu),P_\mu}$ is obtained from sample $u_1,..,u_{P_\mu}$ while ordering the constitutive terms. We note that $\chi$ has an atom at $\chi = 0$ with probability $e^{-\mu}$ and that there are finitely many (Poissonian) fragments. This partition is also $\chi = \sum_{k=1}^{P_\mu}\overline{\Pi}^{-1}\left(\mu u_{(k),P_\mu}\right)$ where, when $\mu\uparrow\infty$, $P_\mu \to \infty$ and $\left(\mu u_{(1),P_\mu}, .., \mu u_{(k),P_\mu}, ..\right) \to (S_1, .., S_k, ..)$, a Poisson point process on $\mathbb{R}_+$.

Defining $\xi_{(k),P_\mu} := \overline{F}^{-1}\left(u_{(k),P_\mu}\right)$, with $\xi_{(1),P_\mu} > .. > \xi_{(P_\mu),P_\mu}$, we get similarly

(5.3) $\phi(\beta) = \mathbf{E}\sum_{k=1}^{P_\mu}\xi_{(k),P_\mu}^{\beta} = \int_0^1x^{\beta}\Pi(dx)$

and the structural measure is $\Pi(dx) = \mu F(dx)$ which is bounded with $\int_0^1x\Pi(dx) = \theta$.

Remark: Assume that the distribution $F$ is the one of a beta$(\theta/(\mu-\theta),1)$, $\mu>\theta$, with mean value $\theta/\mu$. Then

$\overline{\Pi}(x) \to_{\mu\uparrow\infty} -\theta\log x$

which is the Levy-measure tail of the limiting Dickman model discussed in Section 4.1. □

Typical Fragment Size from $(\xi_{(k),P_\mu}, k=1,..,P_\mu)$

In this case, the typical fragment size from $(\xi_{(k),P_\mu}, k=1,..,P_\mu)$ clearly is the $(0,1)$-valued random variable, say $\xi$, with probability distribution $F_\xi(x) = F(x)$ with $\mathbf{E}\xi = \theta/\mu$.

Size-biased Picking from $(\xi_{(k),P_\mu}, k=1,..,P_\mu)$

Let $\eta$ be a $(0,1)$-valued random variable taking the value $\xi_{(k),P_\mu}$ with probability $\xi_{(k),P_\mu}/\chi$ given $(\xi_{(k),P_\mu}, k=1,..,P_\mu)$. This random variable corresponds to a size-biased picking from $(\xi_{(k),P_\mu}, k=1,..,P_\mu)$. Its moment function is

(5.4) $\phi_\eta(q) = \mathbf{E}\eta^{q} := \frac{1}{\theta}\mathbf{E}\sum_{k=1}^{P_\mu}\xi_{(k),P_\mu}^{q+1} = \frac{1}{\theta}\phi(q+1)$

and $\eta$ has probability distribution $F_\eta(x) = \frac{\mu}{\theta}\int_0^{x}zF_\xi(dz)$.
The waiting time paradox reads

$\eta \succeq_{st} \xi$

since $F_\eta(x) \le F_\xi(x)$ for all $x\in[0,1]$.
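5.2 Spacings

As in the unbounded case, spacings of the Poisson partition of $\chi$ deserve interest. Defining the spacings $\tilde\xi_{(k),P_\mu} := \xi_{(k-1),P_\mu}-\xi_{(k),P_\mu}$ (with $\xi_{(0),P_\mu} := 1$ and $\xi_{(P_\mu+1),P_\mu} := 0$), $k = 1,..,P_\mu+1$, then $(\tilde\xi_{(k),P_\mu}, k = 1,..,P_\mu+1)$ constitutes a new sequence of $(0,1)$-valued random variables with clearly $\sum_{k=1}^{P_\mu+1}\tilde\xi_{(k),P_\mu} = 1$. Let for example

$\tilde\phi(\beta) := \mathbf{E}\sum_{k=1}^{P_\mu+1}\tilde\xi_{(k),P_\mu}^{\beta}.$

With $u_{(0),P_\mu} := 0$ and $u_{(P_\mu+1),P_\mu} := 1$, it can be obtained in general from

(5.5) $\tilde\phi(\beta) = \mathbf{E}\sum_{k=1}^{P_\mu+1}\left(\overline{F}^{-1}\left(u_{(k-1),P_\mu}\right)-\overline{F}^{-1}\left(u_{(k),P_\mu}\right)\right)^{\beta}$

so that, given $P_\mu = p$, the joint law of $\left(u_{(k-1),p}; u_{(k),p}\right)$ is needed to compute $\tilde\phi(\beta)$ or the structural measure $\sigma$ such that $\tilde\phi(\beta) = \int_0^1x^{\beta}\sigma(dx)$.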
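Example:

Let $\Pi(dx) = \mu I_{x\in(0,1)}dx$. Then $F(dx)$ is the uniform distribution ($\overline{F}(x) = 1-x$) and $\theta = \mu/2$. We obtain $\mathbf{E}\sum_{k=1}^{P_\mu}\xi_{(k),P_\mu}^{\beta} = \mu/(\beta+1)$. The random variable $\xi$ is uniform and the size-biased picking random variable $\eta$ has distribution beta$(2,1)$.

Concerning spacings, we have

$\tilde\phi(\beta) = \mathbf{E}\sum_{k=1}^{P_\mu+1}\left(\xi_{(k-1),P_\mu}-\xi_{(k),P_\mu}\right)^{\beta} = \int_0^1x^{\beta}\sigma(dx).$

The tail of the structural measure for spacings reads

$\int_x^1\sigma(dz) =: \overline\sigma(x) = \mathbf{E}\sum_{k=1}^{P_\mu+1}I\left(\xi_{(k-1),P_\mu}-\xi_{(k),P_\mu} > x\right) = \mathbf{E}\left[(P_\mu+1)(1-x)^{P_\mu}\right] = e^{-\mu x}\left(1+\mu(1-x)\right),$

recalling that $\overline{F}_{\tilde\xi_{(k),p}}(x) = (1-x)^{p}$ is the tail distribution function of uniform spacings $\tilde\xi_{(k),p} := \xi_{(k-1),p}-\xi_{(k),p}$ (with $\xi_{(0),p} := 1$ and $\xi_{(p+1),p} := 0$), for any $k = 1,..,p+1$.
□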
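For the uniform example, conditioning on $P_\mu = p$ gives $p+1$ spacings, each with tail $(1-x)^{p}$, so the tail of the structural measure is the Poisson average $\mathbf{E}[(P_\mu+1)(1-x)^{P_\mu}]$. The following is a minimal sketch comparing this average, summed term by term, with the closed form $e^{-\mu x}(1+\mu(1-x))$ that follows from the Poisson generating function. The function names and the truncation level `p_max` are hypothetical choices.

```python
import math


def tail_by_poisson_sum(mu, x, p_max=200):
    """E[(P + 1) * (1 - x)**P] for P ~ Poisson(mu), summed term by term."""
    total = 0.0
    pmf = math.exp(-mu)  # P(P = 0)
    for p in range(p_max + 1):
        total += pmf * (p + 1) * (1.0 - x) ** p
        pmf *= mu / (p + 1)  # update to P(P = p + 1)
    return total


def tail_closed_form(mu, x):
    """exp(-mu * x) * (1 + mu * (1 - x))."""
    return math.exp(-mu * x) * (1.0 + mu * (1.0 - x))


mu, x = 5.0, 0.3  # illustrative values
print(tail_by_poisson_sum(mu, x), tail_closed_form(mu, x))
```

At $x = 0$ both expressions reduce to $1+\mu$, the expected number of spacings.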